# Readme

First, make the script executable.

## FOR MAC

- Run the following command, replacing the placeholder with the path to your script:

  ```bash
  chmod +x <Script_Path_Here>
  # example
  chmod +x ./install_hadoop_spark_mac.sh
  ```

- Then run the following command to execute it:

  ```bash
  ./install_hadoop_spark_mac.sh
  ```
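
If you want to double-check that the permission change took effect, listing the script shows the executable bits (an optional sanity check; the path assumes the example script name above):

```bash
# The permissions column should show x flags, e.g. -rwxr-xr-x
ls -l ./install_hadoop_spark_mac.sh
```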

### After Installation

When the script finishes, it prints a summary like the following:

```text
---------------------------------------------
 Installation Complete!
---------------------------------------------
Hadoop UI: http://localhost:9870
YARN UI:   http://localhost:8088
---------------------------------------------
Run next time:
 start-dfs.sh
 start-yarn.sh
---------------------------------------------
```
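
To confirm the daemons actually started, `jps` (bundled with the JDK) lists the running Java processes. The expected set below assumes a standard single-node (pseudo-distributed) setup:

```bash
jps
# Expected entries (PIDs will differ):
#   NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager
```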

### Troubleshooting

If the script doesn't run or displays errors, try the following:

1. Enable Remote Login with Full Disk Access:
   - Go to Settings > General > Sharing > Remote Login
   - Allow Full Disk Access for your terminal application
   - Restart the terminal
2. Set up SSH:

   ```bash
   ssh localhost
   ```

   Type `yes` and hit Enter when asked to accept the host key. If it still prompts for a password, see the key-setup sketch after this list.
3. Start the Hadoop services:

   ```bash
   start-dfs.sh
   start-yarn.sh
   ```

It should work fine now.
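
`start-dfs.sh` needs passwordless SSH to localhost; if it keeps prompting for a password, the usual fix is to authorize your own key. A minimal sketch using standard OpenSSH commands (skip `ssh-keygen` if you already have a key):

```bash
# Generate a key pair with no passphrase (only if ~/.ssh/id_rsa doesn't exist yet)
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# Authorize the key for logins to localhost
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
# This should now succeed without a password prompt
ssh localhost exit
```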

## FOR WINDOWS

### Step 1: Install WSL (Windows Subsystem for Linux)

1. Open PowerShell as Administrator and run:

   ```powershell
   wsl --install
   ```

2. Restart your computer when prompted.
3. After the restart, WSL will open automatically and ask you to create an Ubuntu user account.

### Step 2: Install Ubuntu (if not already installed)

1. Open PowerShell and check the installed distributions:

   ```powershell
   wsl --list --verbose
   ```

2. If Ubuntu is not installed, install it:

   ```powershell
   wsl --install -d Ubuntu
   ```

### Step 3: Run the Installation Script

1. Open Ubuntu (WSL) from the Start Menu, or run `wsl` in PowerShell.
2. Navigate to the scripts directory:

   ```bash
   cd /mnt/c/Users/YourUsername/Downloads/big_data_course/Scripts
   ```

3. Make the script executable:

   ```bash
   sudo chmod +x install_hadoop_spark_ubuntu.sh
   ```

4. Run the script:

   ```bash
   ./install_hadoop_spark_ubuntu.sh
   ```
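
If bash aborts with errors like `$'\r': command not found`, the script most likely picked up Windows (CRLF) line endings. Stripping the carriage returns usually fixes it (only needed if you see such errors):

```bash
# Remove Windows carriage returns in place
sed -i 's/\r$//' install_hadoop_spark_ubuntu.sh
```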

### Step 4: After Installation

On success, the script prints a summary like this:

```text
============================================
 Hadoop + Spark Installation Complete!
============================================

HADOOP SERVICES:
 NameNode UI      : http://localhost:9870
 YARN ResourceMgr : http://localhost:8088

SPARK:
 Spark Home    : /opt/spark
 Spark Version : 4.0.1 (Hadoop 3 compatible)

HADOOP USER:
 Username       : hadoop
 Password       : hadoop
 Switch to user : sudo su - hadoop

TO TEST SPARK WITH HADOOP:
 1. Switch to hadoop user: sudo su - hadoop
 2. Start spark-shell: spark-shell
 3. Test HDFS connection:
    scala> val data = sc.textFile("hdfs:///data/data.csv")
    scala> data.take(5)

Note: Make sure HDFS is running and you have data uploaded
============================================
```
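
The HDFS test above assumes a file already exists at `hdfs:///data/data.csv`. A minimal way to upload one, assuming you have a local `data.csv` in the current directory (the file name is just an example):

```bash
# Run as the hadoop user (sudo su - hadoop first)
hdfs dfs -mkdir -p /data       # create the target directory in HDFS
hdfs dfs -put data.csv /data/  # upload the local file
hdfs dfs -ls /data             # confirm it's there
```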

### Troubleshooting for Windows/WSL

1. If `wsl` is not recognized:
   - Make sure Windows is updated to version 1903 or higher
   - Run `wsl --update` in PowerShell as Administrator
2. If you can't access the script:
   - Windows drives are mounted under /mnt/ in WSL
   - The C: drive is at /mnt/c/
   - Adjust the path according to where you saved the files
3. If the services don't start:
   - Run `hdfs namenode -format` (note: this wipes any existing HDFS data)
   - Restart the terminal
   - Switch to the hadoop user: `sudo su - hadoop`
   - Start the services manually, then verify with the sketch below:

     ```bash
     start-dfs.sh
     start-yarn.sh
     ```
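
To verify the daemons after a manual start, `jps` lists the running Java processes and `hdfs dfsadmin -report` confirms the NameNode can reach its DataNode (both are standard JDK/Hadoop tools):

```bash
# Run as the hadoop user
jps                    # expect NameNode, DataNode, ResourceManager, NodeManager
hdfs dfsadmin -report  # should report at least one live DataNode
```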