Am I using LazyArrays correctly? It appears slower and allocating more memory than regular Arrays. #217

Open
ericqu opened this issue Aug 18, 2022 · 1 comment

Comments

ericqu commented Aug 18, 2022

I am trying to reduce memory allocation for matrix multiplication in my project.
I have a simple (non-lazy) implementation and naively thought I could reduce allocations by waving the LazyArrays magic wand.
It does not work out of the box. Most likely I am not using it properly; if so, please let me know how to fix my code. Alternatively, I may have been overly optimistic and this is simply not a good use case for LazyArrays; I would appreciate your feedback either way. It could also be that I am running the code on an M1, which may not be able to benefit from an optimized BLAS/LAPACK.

The version is LazyArrays v0.22.11.

Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin21.2.0)
  CPU: Apple M1
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, cyclone)
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 

and the code to reproduce the issue is:

using LinearAlgebra, BenchmarkTools
using LazyArrays

MXs = [1.0 1.0 1.0
    1.0 2.0 1.0
    1.0 3.0 1.0
    1.0 1.0 -1.0
    1.0 2.0 -1.0
    1.0 3.0 -1.0]

MY1 = [1.0
    3.0
    3.0
    2.0
    2.0
    1.0]

function normeq(x, y)
    nn, p = size(x)
    if ndims(y) == 2
        n, ypp = size(y)
    else
        ypp = 1
    end
    back = ypp - 1

    xy = [x y]
    xytxy = xy' * xy

    return xytxy[1:p, end-back:end]
end

function l_normeq(x, y)
    nn, p = size(x)
    if ndims(y) == 2
        n, ypp = size(y)
    else
        ypp = 1
    end
    back = ypp - 1

    xy = ApplyArray(hcat, x, y)

    xytxy = ApplyArray(*, xy', xy)
    # Size the buffer from the inputs instead of hard-coding 4×4
    m_xytxy = Matrix{Float64}(undef, p + ypp, p + ypp)
    copyto!(m_xytxy, xytxy)

    return m_xytxy[1:p, end-back:end]
end

normeq(MXs, MY1)
l_normeq(MXs, MY1)

@btime normeq(MXs, MY1)
@btime l_normeq(MXs, MY1)

I obtain the following results:

  184.997 ns (3 allocations: 528 bytes)
  11.875 μs (705 allocations: 15.97 KiB)

I should add that I also tried larger arrays (the small ones above are only for simplicity), and the results show the same trend.

@ericqu ericqu changed the title Am using LazyArrays correctly? It appears slower and allocating more memory than regular Arrays. Am I using LazyArrays correctly? It appears slower and allocating more memory than regular Arrays. Aug 18, 2022
@putianyi889
Contributor

The final indexing steps

return xytxy[1:p, end-back:end]
return m_xytxy[1:p, end-back:end]

each allocate a new concrete array, because slicing with a range copies the selected block. Have you tried @view?
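
To make the @view suggestion concrete, here is a minimal sketch, not a definitive fix. The names `normeq_view` and `normeq_block!` are hypothetical and do not appear in the thread. The first function only replaces the copying slice with a view. The second relies on the block structure of `[x y]' * [x y]`: the rows `1:p` and the last `ypp` columns are exactly `x' * y`, so that block can be written directly into a caller-supplied buffer with `mul!` from LinearAlgebra.

```julia
using LinearAlgebra

# Hypothetical helper: same computation as normeq, but the final slice is a
# view into the product rather than a freshly allocated copy.
function normeq_view(x, y)
    p = size(x, 2)
    ypp = size(y, 2)              # size(y, 2) is 1 when y is a Vector
    xy = [x y]                    # still allocates the concatenation
    xytxy = xy' * xy              # still allocates the full product
    return @view xytxy[1:p, end-ypp+1:end]
end

# Hypothetical helper: the requested block equals x' * y, so compute only
# that block, in place, into a preallocated `out`.
function normeq_block!(out, x, y)
    mul!(out, x', y)
    return out
end

x = [1.0 1.0  1.0
     1.0 2.0  1.0
     1.0 3.0  1.0
     1.0 1.0 -1.0
     1.0 2.0 -1.0
     1.0 3.0 -1.0]
y = [1.0, 3.0, 3.0, 2.0, 2.0, 1.0]

out = Vector{Float64}(undef, size(x, 2))   # allocated once, reusable
normeq_block!(out, x, y)
```

The view only saves the last copy; the concatenation and the full product are still materialized, so the in-place variant is the one more likely to reduce allocations. When timing either version with BenchmarkTools, interpolating the global arguments, as in `@btime normeq_block!($out, $x, $y)`, keeps global-variable overhead out of the measurement.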
