Numpy should use Apple Accelerate instead of OpenBLAS on Apple Silicon #181251

Open
4 tasks done
vlebert opened this issue Aug 15, 2024 · 3 comments
Labels
bug Reproducible Homebrew/homebrew-core bug help wanted Task(s) needing PRs from the community or maintainers

Comments


vlebert commented Aug 15, 2024

brew gist-logs <formula> link OR brew config AND brew doctor output

❯ brew config
HOMEBREW_VERSION: 4.3.15
ORIGIN: https://github.com/Homebrew/brew
HEAD: fa53e7b1e51a2deb7ec5a1e12452a1182dc342f7
Last commit: 3 days ago
Core tap JSON: 15 Aug 08:25 UTC
Core cask tap JSON: 15 Aug 08:25 UTC
HOMEBREW_PREFIX: /opt/homebrew
HOMEBREW_CASK_OPTS: []
HOMEBREW_MAKE_JOBS: 10
Homebrew Ruby: 3.3.4 => /opt/homebrew/Library/Homebrew/vendor/portable-ruby/3.3.4_1/bin/ruby
CPU: 10-core 64-bit arm_firestorm_icestorm
Clang: 15.0.0 build 1500
Git: 2.39.3 => /Library/Developer/CommandLineTools/usr/bin/git
Curl: 8.6.0 => /usr/bin/curl
macOS: 14.5-arm64
CLT: 15.3.0.0.1.1708646388
Xcode: N/A
Rosetta 2: false

❯ brew doctor
Please note that these warnings are just used to help the Homebrew maintainers
with debugging if you file an issue. If everything you use Homebrew for is
working fine: please don't worry or file an issue; just ignore this. Thanks!

Warning: Some installed formulae are deprecated or disabled.
You should find replacements for the following formulae:
  [email protected]

Verification

  • My brew doctor output says Your system is ready to brew. and am still able to reproduce my issue.
  • I ran brew update and am still able to reproduce my issue.
  • I have resolved all warnings from brew doctor and that did not fix my problem.
  • I searched for recent similar issues at https://github.com/Homebrew/homebrew-core/issues?q=is%3Aissue and found no duplicates.

What were you trying to do (and why)?

I initially used Miniconda for Python on my M1 Mac, but the Python provided by Homebrew is now compatible with Apple Silicon.

I am trying to use Homebrew Python on my Mac.

What happened (include all command output)?

NumPy is installed as a dependency of several packages on my system; however, the installation that Homebrew provides is not optimized for Apple Silicon.

See numpy/numpy#24961.

On my machine:

> python3
> import numpy
> numpy.show_config()

...
"Build Dependencies": {
    "blas": {
      "name": "openblas",
      "found": true,
      "version": "0.3.27",
      "detection method": "system",
      "include directory": "unknown",
      "lib directory": "unknown",
      "openblas configuration": "unknown",
      "pc file directory": "unknown"
    },

...
> python3 -m venv .env
> source .env/bin/activate
> pip install numpy
> python
> import numpy
> numpy.show_config()

...
"Build Dependencies": {
    "blas": {
      "name": "accelerate",
      "found": true,
      "version": "unknown",
      "detection method": "system",
      "include directory": "unknown",
      "lib directory": "unknown",
      "openblas configuration": "unknown",
      "pc file directory": "unknown"
    },
...
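
The same comparison can be made without reading the printed output by eye. A minimal sketch, assuming NumPy 1.25 or newer, where `numpy.show_config(mode="dicts")` returns the configuration as a dictionary instead of printing it. The helper names `blas_name` and `current_blas_name` are hypothetical and only illustrate the lookup.

```python
from typing import Any, Mapping, Optional


def blas_name(config: Mapping[str, Any]) -> Optional[str]:
    """Return the BLAS library name recorded in a NumPy build-config dict."""
    deps = config.get("Build Dependencies", {})
    return deps.get("blas", {}).get("name")


def current_blas_name() -> Optional[str]:
    """Look up the BLAS name for the NumPy importable by this interpreter."""
    # Imported lazily so blas_name() stays usable without NumPy installed.
    import numpy

    # Assumption: NumPy >= 1.25, where mode="dicts" returns the config.
    return blas_name(numpy.show_config(mode="dicts"))
```

Calling `current_blas_name()` with the Homebrew interpreter and again inside the venv should give `openblas` and `accelerate` respectively, matching the two listings above.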

Running a performance test is up to 10 times faster in my .env than on my system. I suppose this can slow down every package that depends on the system NumPy.
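
The performance test itself is not included in this report. A minimal sketch of one way to run such a comparison, assuming an SVD on a random square matrix as the workload. The matrix size, seed, and the names `mean_runtime` and `svd_workload` are illustrative assumptions, not the script that produced the numbers in this thread.

```python
import time
from typing import Callable


def mean_runtime(work: Callable[[], object], runs: int = 10) -> float:
    """Return the mean wall-clock time in seconds of `runs` calls to `work`."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        work()
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)


def svd_workload(size: int = 1000, seed: int = 0) -> Callable[[], object]:
    """Build a zero-argument SVD call on a fixed random square matrix."""
    # Imported lazily; the timing helper itself needs only the standard library.
    import numpy as np

    # Fixed seed so every interpreter factorizes the same matrix.
    matrix = np.random.default_rng(seed).standard_normal((size, size))
    return lambda: np.linalg.svd(matrix)
```

Running `print(mean_runtime(svd_workload()))` once with the Homebrew `python3` and once with the venv interpreter gives two means that can be compared directly.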

What did you expect to happen?

Numpy provided by brew should be compiled with Apple Silicon optimizations.

Step-by-step reproduction instructions (by running brew commands)

See previous section
@vlebert vlebert added the bug Reproducible Homebrew/homebrew-core bug label Aug 15, 2024
@carlocab carlocab added the help wanted Task(s) needing PRs from the community or maintainers label Aug 15, 2024
@carlocab carlocab changed the title from "Numpy provided by brew is not optimized for Apple Silicon" to "Numpy should use Apple Accelerate instead of OpenBLAS on Apple Silicon" Aug 15, 2024
Member

cho-m commented Aug 15, 2024

The Accelerate update in macOS 13.3 could allow us to reconsider the guidance at https://docs.brew.sh/Formula-Cookbook#linear-algebra-libraries (perhaps only from Sonoma onward, in case a Ventura user is still on 13.0 - 13.2).

This would require a documentation change and switching all formulae to Accelerate for Sonoma onward.

It may also be worth exploring optimization opportunities in OpenBLAS if anyone wants to try tuning it.


EDIT:

Running a performance test is up to 10 times faster in my .env than on my system.

Just noting that I've also seen worse SVD performance with OpenBLAS at whatever the default number of OpenMP threads is (maybe the total of P+E cores). After adjusting the thread count it is still slower than Accelerate, but not as bad as 10x.
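
One way to experiment with the thread count is to cap `OMP_NUM_THREADS` before NumPy is imported. The sketch below is an assumption-laden illustration: it reads the performance-core count from the `hw.perflevel0.physicalcpu` sysctl key on Apple Silicon macOS and falls back to all logical CPUs elsewhere. Whether the P-core count is the best setting has not been measured in this thread, and the helper names are hypothetical.

```python
import os
import subprocess
from typing import Dict, Mapping


def performance_core_count() -> int:
    """Best-effort count of performance cores; falls back to all logical CPUs."""
    fallback = os.cpu_count() or 1
    try:
        # Assumption: this sysctl key is present on Apple Silicon macOS.
        out = subprocess.run(
            ["sysctl", "-n", "hw.perflevel0.physicalcpu"],
            capture_output=True, text=True, check=True, timeout=5,
        ).stdout
        return max(1, int(out.strip()))
    except (OSError, subprocess.SubprocessError, ValueError):
        # Key missing, sysctl unavailable, or unparsable output.
        return fallback


def pin_openmp_threads(env: Mapping[str, str], threads: int) -> Dict[str, str]:
    """Return a copy of env with OMP_NUM_THREADS set, unless already present."""
    pinned = dict(env)
    pinned.setdefault("OMP_NUM_THREADS", str(threads))
    return pinned
```

For the setting to take effect, `os.environ.update(pin_openmp_threads(os.environ, performance_core_count()))` has to run before `import numpy`, since the OpenMP runtime reads the variable when it is loaded.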

Member

cho-m commented Aug 21, 2024

We may also try experimenting with FlexiBLAS, which allows runtime selection of the BLAS/LAPACK backend. Fedora decided to switch to it in Fedora 33 (https://fedoraproject.org/wiki/Changes/FlexiBLAS_as_BLAS/LAPACK_manager), though its LGPLv3 license does impose some licensing restrictions.

EDIT: FlexiBLAS formula request - #181938

Member

cho-m commented Aug 21, 2024

Sample results. venv is the PyPI numpy wheel; the others are a locally rebuilt brew numpy linked to FlexiBLAS:

hyperfine --runs 2 'venv/bin/python3 mysvd.py' 'FLEXIBLAS=APPLE python3.12 mysvd.py' 'FLEXIBLAS=OPENBLASOPENMP python3.12 mysvd.py' 'FLEXIBLAS=OPENBLASOPENMP OMP_NUM_THREADS=4 python3.12 mysvd.py'
Benchmark 1: venv/bin/python3 mysvd.py
  Time (mean ± σ):     11.344 s ±  0.977 s    [User: 10.263 s, System: 3.067 s]
  Range (min … max):   10.653 s … 12.034 s    2 runs

Benchmark 2: FLEXIBLAS=APPLE python3.12 mysvd.py
  Time (mean ± σ):     10.436 s ±  0.129 s    [User: 10.094 s, System: 3.084 s]
  Range (min … max):   10.344 s … 10.527 s    2 runs

Benchmark 3: FLEXIBLAS=OPENBLASOPENMP python3.12 mysvd.py
  Time (mean ± σ):     67.164 s ±  1.838 s    [User: 133.651 s, System: 36.703 s]
  Range (min … max):   65.865 s … 68.464 s    2 runs

Benchmark 4: FLEXIBLAS=OPENBLASOPENMP OMP_NUM_THREADS=4 python3.12 mysvd.py
  Time (mean ± σ):     27.695 s ±  0.139 s    [User: 26.499 s, System: 9.204 s]
  Range (min … max):   27.596 s … 27.793 s    2 runs

Summary
  FLEXIBLAS=APPLE python3.12 mysvd.py ran
    1.09 ± 0.09 times faster than venv/bin/python3 mysvd.py
    2.65 ± 0.04 times faster than FLEXIBLAS=OPENBLASOPENMP OMP_NUM_THREADS=4 python3.12 mysvd.py
    6.44 ± 0.19 times faster than FLEXIBLAS=OPENBLASOPENMP python3.12 mysvd.py
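
The ratios in the summary can be rechecked from the means and standard deviations printed above. The sketch below assumes the ± values come from first-order error propagation for a quotient of independent measurements; that is an assumption about how hyperfine derives them, not something stated in this thread. The function name `speedup` is illustrative.

```python
import math
from typing import Tuple


def speedup(slow_mean: float, slow_sd: float,
            fast_mean: float, fast_sd: float) -> Tuple[float, float]:
    """Return (ratio, uncertainty) for slow_mean / fast_mean.

    The uncertainty uses first-order error propagation for a quotient,
    treating the two measurements as independent.
    """
    ratio = slow_mean / fast_mean
    relative = math.hypot(slow_sd / slow_mean, fast_sd / fast_mean)
    return ratio, ratio * relative


# Means and standard deviations (seconds) copied from the hyperfine output above.
apple = (10.436, 0.129)
results = {
    "venv": (11.344, 0.977),
    "openblas, OMP_NUM_THREADS=4": (27.695, 0.139),
    "openblas, default threads": (67.164, 1.838),
}

for label, (mean, sd) in results.items():
    ratio, err = speedup(mean, sd, *apple)
    print(f"{label}: {ratio:.2f} ± {err:.2f}")
```

Because the inputs are the rounded figures from the output rather than hyperfine's internal values, the last digit could in principle differ from the summary.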

EDIT: It is also a bit sad that OpenBLAS is worse than NETLIB. As previously mentioned, the number of OpenMP threads may be a reason.

FLEXIBLAS=NETLIB python3.12 mysvd.py
mean of 10 runs: 1.58209s
FLEXIBLAS=OPENBLASOPENMP OMP_NUM_THREADS=1 python3.12 mysvd.py
mean of 10 runs: 1.74133s
