Readme
First, make the script executable.
FOR MAC
- Run the following commands:

```bash
chmod +x <Script_Path_Here>
# example
chmod +x ./install_hadoop_spark_mac.sh
```

- Then run the following command to execute it:

```bash
./install_hadoop_spark_mac.sh
```

After Installation
```text
---------------------------------------------
Installation Complete!
---------------------------------------------
Hadoop UI: http://localhost:9870
YARN UI:   http://localhost:8088
---------------------------------------------
Run next time:
start-dfs.sh
start-yarn.sh
---------------------------------------------
```

Troubleshooting
If it doesn't run or displays errors, try:
- Enable Remote Login with Full Disk Access:
  - Go to Settings > General > Sharing > Remote Login
  - Allow Full Disk Access for your terminal app
  - Restart the terminal
- Set up SSH:

```bash
ssh localhost
```

  Type `yes` and hit Enter.
- Start the Hadoop services:

```bash
start-dfs.sh
start-yarn.sh
```

It should work fine now.
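To confirm the daemons actually came up after `start-dfs.sh` and `start-yarn.sh`, you can inspect the JVM process list with `jps` (shipped with the JDK). The helper below is a hypothetical convenience, not part of the install script; it reads `jps` output and reports any expected daemon that is missing:

```bash
# check_hadoop_daemons: read `jps` output on stdin and report any of the
# expected Hadoop/YARN daemons that are not running. Returns non-zero if
# anything is missing.
check_hadoop_daemons() {
  local missing=0 output
  output=$(cat)
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # -w matches whole words, so "NameNode" does not match "SecondaryNameNode"
    if ! printf '%s\n' "$output" | grep -qw "$daemon"; then
      echo "missing: $daemon"
      missing=1
    fi
  done
  return $missing
}

# Usage:
#   jps | check_hadoop_daemons
```

If the function prints nothing and exits cleanly, all five daemons are up.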
FOR WINDOWS
Step 1: Install WSL (Windows Subsystem for Linux)
- Open PowerShell as Administrator and run:

```powershell
wsl --install
```

- Restart your computer when prompted
- After restart, WSL will open automatically and ask you to create an Ubuntu user account
Step 2: Install Ubuntu (if not already installed)
- Open PowerShell and check installed distributions:

```powershell
wsl --list --verbose
```

- If Ubuntu is not installed, install it:

```powershell
wsl --install -d Ubuntu
```

Step 3: Run the Installation Script
- Open Ubuntu (WSL) from the Start Menu, or run `wsl` in PowerShell
- Navigate to the scripts directory:

```bash
cd /mnt/c/Users/YourUsername/Downloads/big_data_course/Scripts
```

- Make the script executable:

```bash
sudo chmod +x install_hadoop_spark_ubuntu.sh
```

- Run the script:

```bash
./install_hadoop_spark_ubuntu.sh
```

Step 4: After Installation
```text
============================================
Hadoop + Spark Installation Complete!
============================================
HADOOP SERVICES:
NameNode UI      : http://localhost:9870
YARN ResourceMgr : http://localhost:8088
SPARK:
Spark Home    : /opt/spark
Spark Version : 4.0.1 (Hadoop 3 compatible)
HADOOP USER:
Username : hadoop
Password : hadoop
Switch to user : sudo su - hadoop
TO TEST SPARK WITH HADOOP:
1. Switch to hadoop user: sudo su - hadoop
2. Start spark-shell: spark-shell
3. Test HDFS connection:
   scala> val data = sc.textFile("hdfs:///data/data.csv")
   scala> data.take(5)
Note: Make sure HDFS is running and you have data uploaded
============================================
```

Troubleshooting for Windows/WSL
- If WSL is not recognized:
  - Make sure Windows is updated to version 1903 or higher
  - Run `wsl --update` in PowerShell as Administrator
- If you can't access the script:
  - Windows drives are mounted at `/mnt/` in WSL; the C: drive is at `/mnt/c/`
  - Adjust the path according to where you saved the files
- If services don't start:
  - Format the NameNode and restart the terminal:

```bash
hdfs namenode -format
```

  - Switch to the hadoop user: `sudo su - hadoop`
  - Start the services manually:

```bash
start-dfs.sh
start-yarn.sh
```
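The spark-shell test in the post-install message reads `hdfs:///data/data.csv`, so some data has to be in HDFS first. A minimal sketch, assuming a running HDFS and the hadoop user — the file name and CSV contents below are placeholders for illustration, not files the install script creates:

```bash
# Create a small sample CSV locally (placeholder data for illustration)
cat > /tmp/data.csv <<'EOF'
id,name,value
1,alpha,10
2,beta,20
EOF

# Copy it into HDFS when the hadoop client is available on this machine
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -mkdir -p /data                       # create the target directory
  hdfs dfs -put -f /tmp/data.csv /data/data.csv  # upload, overwriting if present
  hdfs dfs -ls /data                             # verify the upload
fi
```

After that, `sc.textFile("hdfs:///data/data.csv")` followed by `data.take(5)` in spark-shell should return the header plus the two data rows.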