Incompatible Architecture #34

@swetepete

I am using a 2021 iMac with the Apple M1 chip and macOS Monterey 12.4.

To set up PySpark, I have so far installed pyspark with pip3, cloned this repo and installed its dependencies from the requirements.txt file, and downloaded Java from the Java homepage. I'm using Python 3.8.9.

I set SPARK_HOME in my .zshrc to the path of the pip3 installation of pyspark and sourced the file:

% echo $SPARK_HOME
/Users/julius/Library/Python/3.8/lib/python/site-packages/pyspark

I then executed the following command:

$SPARK_HOME/bin/spark-submit ./server_count.py \
	--num_output_partitions 1 --log_level WARN \
	./input/test_warc.txt servernames

I had to execute this from inside the cc-pyspark repo; otherwise spark-submit could not find the program server_count.py.

It returns this error message:

julius@Juliuss-iMac cc-pyspark % $SPARK_HOME/bin/spark-submit ./server_count.py \
        --num_output_partitions 1 --log_level WARN \
        ./input/test_warc.txt servernames
Traceback (most recent call last):
  File "/Users/julius/cc-pyspark/server_count.py", line 1, in <module>
    import ujson as json
ImportError: dlopen(/Users/julius/Library/Python/3.8/lib/python/site-packages/ujson.cpython-38-darwin.so, 0x0002): tried: '/Users/julius/Library/Python/3.8/lib/python/site-packages/ujson.cpython-38-darwin.so' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64'))
22/07/06 15:04:13 INFO ShutdownHookManager: Shutdown hook called
22/07/06 15:04:13 INFO ShutdownHookManager: Deleting directory /private/var/folders/xv/yzpjb77s2qg14px8dc7g4m_80000gn/T/spark-80c476e9-b5ba-4710-b292-e367dd387ece

There seems to be something wrong with my installation of "ujson": according to the error it is built for arm64, but x86_64 is needed. Does that mean PySpark is designed for x86?
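To confirm which architectures a compiled file such as the ujson .so actually contains, its Mach-O header can be read directly. The following is a minimal sketch, not part of cc-pyspark: the function names `macho_architectures` and `file_architectures` are hypothetical, and only 64-bit thin files and universal (fat) files are handled.

```python
import struct

# CPU type constants from the Mach-O format (mach/machine.h).
CPU_TYPES = {0x01000007: "x86_64", 0x0100000C: "arm64"}


def macho_architectures(data: bytes) -> list:
    """Return the architecture names declared in the given Mach-O header bytes."""
    if len(data) < 8:
        return []
    # Universal (fat) binaries start with a big-endian header.
    if struct.unpack(">I", data[:4])[0] == 0xCAFEBABE:
        count = struct.unpack(">I", data[4:8])[0]
        archs = []
        for i in range(count):
            start = 8 + i * 20  # each fat_arch entry is 20 bytes
            if start + 4 > len(data):
                break  # header truncated, or not really a fat binary
            cputype = struct.unpack(">I", data[start:start + 4])[0]
            archs.append(CPU_TYPES.get(cputype, hex(cputype)))
        return archs
    # Thin 64-bit files on Intel and Apple Silicon are little-endian.
    if struct.unpack("<I", data[:4])[0] == 0xFEEDFACF:
        cputype = struct.unpack("<I", data[4:8])[0]
        return [CPU_TYPES.get(cputype, hex(cputype))]
    return []  # not a format this sketch recognises


def file_architectures(path: str) -> list:
    """Hypothetical helper: inspect the first bytes of a file on disk."""
    with open(path, "rb") as handle:
        return macho_architectures(handle.read(4096))
```

Calling `file_architectures` on the ujson.cpython-38-darwin.so path from the traceback should list only arm64 if the error message is accurate. The same call on the python3 and java executables would show whether each is universal or single-architecture.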

What is the simplest way to fix this issue? Should I try to run PySpark in some kind of x86 emulation like Rosetta? Has PySpark not been designed for the M1 chip?

Is there a chance this is the fault of my Java installation? I took the first one offered; it seemed to say x86, but when I tested running PySpark on its own, it seemed to work fine.
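One way to test the Java hypothesis is to compare the architecture the Python interpreter reports when started directly with what it reports when started by spark-submit. This is a minimal diagnostic sketch, assuming it is saved under a hypothetical name such as check_arch.py; the assumption that an x86 Java running under Rosetta causes the Python driver to run as x86_64 is unconfirmed here and is exactly what the comparison is meant to check.

```python
import platform
import sys


def interpreter_report() -> dict:
    """Collect the facts that matter for an architecture mismatch."""
    return {
        # Architecture this process is running as, e.g. 'arm64' or 'x86_64'.
        "machine": platform.machine(),
        # Path of the interpreter binary that was actually started.
        "executable": sys.executable,
        "python_version": platform.python_version(),
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

Running it once as `python3 check_arch.py` and once as `$SPARK_HOME/bin/spark-submit check_arch.py` gives two reports to compare. If the first shows arm64 and the second x86_64, the Java installation would be a plausible cause, since the arm64 build of ujson cannot be loaded by an x86_64 process.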

Thanks very much
