@matthiasdiener
Collaborator

@matthiasdiener matthiasdiener commented Aug 31, 2022

@matthiasdiener matthiasdiener self-assigned this Aug 31, 2022
@matthiasdiener matthiasdiener requested a review from inducer August 31, 2022 03:02
Owner

@inducer inducer left a comment

This looks good generally. Two style nits below.

Two questions:

  • Does it work?
  • How come it's marked as draft?


with ProcessLogger(logger, f"generate_loopy for '{prg_id}'"):
    import pyopencl as cl
    dev = self.actx.context.devices[0]
Owner

Safer to get it from the queue, which only has one device.

Collaborator Author

Done in 1e54ce4
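
For readers following the thread, here is a minimal sketch of why the queue is the safer source. A context may hold several devices, so `context.devices[0]` can name a device other than the one the queue actually runs on, while a command queue is bound to exactly one device (PyOpenCL exposes it as `queue.device`). The `SimpleNamespace` objects and the two helper functions are hypothetical stand-ins so the example runs without an OpenCL installation; they are not code from this PR.

```python
from types import SimpleNamespace


def device_from_context(context):
    """Return the first device of a context (ambiguous with several devices)."""
    return context.devices[0]


def device_from_queue(queue):
    """Return the single device that a command queue is bound to."""
    return queue.device


# Hypothetical stand-ins for a two-device context and a queue on the second device.
ctx = SimpleNamespace(devices=["gpu0", "gpu1"])
queue = SimpleNamespace(context=ctx, device="gpu1")

print(device_from_context(ctx))   # first device in the context
print(device_from_queue(queue))   # the device the queue actually uses
```

With a real `pyopencl.CommandQueue`, the equivalent lookup would be `queue.device`.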

        and cl.characterize.has_coarse_grain_buffer_svm(dev)):
    limit = dev.max_parameter_size
    # Leave some extra space since our sizes are estimates
    target = lp.PyOpenCLTarget(limit_arg_size_nbytes=limit//2)
Owner

Hmm, upon second thought: We can only pass this if we're sure that the memory allocated is actually SVM. So this has to get involved with memory pool creation.

Collaborator Author

Should we do this here or as part of inducer/loopy#642 ?

Collaborator Author

Done in faba326 and 61038f9
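
The decision discussed in this thread can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the merged implementation: the function name and its boolean parameters are hypothetical, the GPU check is inferred from the PR title, and the requirement that the allocator actually hands out SVM memory reflects the review comment above. The halving mirrors the `limit//2` safety margin in the quoted diff.

```python
from typing import Optional


def pick_arg_size_limit(is_gpu: bool,
                        device_has_svm: bool,
                        allocator_is_svm: bool,
                        max_parameter_size: int) -> Optional[int]:
    """Return an argument-size limit in bytes, or None for no limit.

    All parameter names are hypothetical. In the quoted diff,
    max_parameter_size corresponds to dev.max_parameter_size and
    device_has_svm to cl.characterize.has_coarse_grain_buffer_svm(dev).
    """
    # Only limit when the device supports SVM *and* the memory in use
    # was really allocated as SVM (the reviewer's concern).
    if not (is_gpu and device_has_svm and allocator_is_svm):
        return None
    # Leave some extra space since the size estimates are approximate.
    return max_parameter_size // 2


print(pick_arg_size_limit(True, True, True, 4352))
print(pick_arg_size_limit(True, True, False, 4352))
```

In the PR itself, a non-`None` result would correspond to passing `limit_arg_size_nbytes` to `lp.PyOpenCLTarget`, as shown in the diff excerpt.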

@matthiasdiener
Collaborator Author

matthiasdiener commented Sep 7, 2022

  • Does it work?

As far as I can tell, yes.

  • How come it's marked as draft?

I think this still needs the other PRs to be merged first.

@matthiasdiener matthiasdiener marked this pull request as ready for review September 13, 2022 13:37
@matthiasdiener matthiasdiener changed the title from "LazilyPyOpenCLCompilingFunctionCaller: limit arg size for GPUs" to "PytatoPyOpenCLArrayContext, use SVM allocator if available, limit arg size for GPUs" Sep 13, 2022
@matthiasdiener matthiasdiener changed the title from "PytatoPyOpenCLArrayContext, use SVM allocator if available, limit arg size for GPUs" to "PytatoPyOpenCLArrayContext: use SVM allocator if available, limit arg size for GPUs" Sep 13, 2022
@matthiasdiener
Collaborator Author

This is ready for another review @inducer

@inducer inducer enabled auto-merge (squash) September 19, 2022 22:07
@inducer
Owner

inducer commented Sep 19, 2022

Thanks!

@inducer inducer merged commit 3c9aee6 into main Sep 19, 2022
@inducer inducer deleted the limit-arg-size branch September 19, 2022 22:44

3 participants