Add new copy of 005 for review purposes. #83

Open
ccoutant wants to merge 1 commit into main from dev
@ccoutant
Owner

Fresh copy of 005 so that comments can be made on this pull request with the whole document in view.

use to cases where the location described is final, and not subject to
some further modification, with two exceptions. First, if the location
description is a memory location description, it is a simple DWARF
expression (Section 2.5) that can be modified by further DWARF
Contributor

> it is a simple DWARF expression (Section 2.5) that can be modified by further DWARF expression operators.

Just to confirm my understanding of what you are saying here. It can be modified, because a memory location description in DWARF 5 is just a value / number?

Owner Author

Yes.

Also consider the case where a `DW_OP_call*` operator is used to get the
location of a variable. If the variable happens to be in a register at
the current PC, the call operator cannot succeed, as it cannot push
anything but a memory location on the stack.
Contributor

It's perhaps nitpicking, but I find it a bit misleading to say that a memory location can be pushed on the stack in DWARF 5 (especially with the following paragraph that says that location descriptions can't be pushed on the stack). I would suggest:

"... as it cannot push anything but an address representing the location of the variable in memory"

Most existing arithmetic and logical operators, defined in Section 2.5.1.4,
continue to be limited to operating on values only.

The `DW_OP_deref*` and `DW_OP_xderef*` operators are extended to operate on
Contributor

How does it work for xderef, which consumes a stack element indicating the memory space? Because of that, I don't think it makes sense for it to operate on locations. It would make no sense to deref a register location while providing an explicit address space number. Even for memory location descriptions, what if the memory location description to deref has address space number 2, but the address space operand of xderef specifies address space number 3?

In the original proposal [1], it keeps the behavior it had in DWARF 5, and is marked deprecated. It consumes a scalar value representing an address and a scalar value representing the address space. I think it would make sense to define it like that here (the goal being only to keep backwards compat).

[1] https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html#a-2-5-4-3-4-special-value-operations
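
To make the backward-compatible behavior suggested above concrete, here is a minimal Python sketch. It assumes a toy stack of scalar `Value` entries and a hypothetical `read_memory(space, addr, size)` callback supplied by the consumer; neither name comes from DWARF or from the proposal, and the little-endian byte order is an arbitrary choice for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """A scalar stack entry (toy stand-in for a value of the generic type)."""
    n: int


def op_xderef(stack, read_memory, size):
    """Legacy-style DW_OP_xderef: consumes two scalar values, pushes a value.

    The top entry is treated as the address and the entry below it as the
    address space identifier. `read_memory` is a hypothetical callback.
    """
    addr = stack.pop()
    space = stack.pop()
    if not (isinstance(addr, Value) and isinstance(space, Value)):
        raise ValueError("DW_OP_xderef expects two scalar values")
    data = read_memory(space.n, addr.n, size)
    # Assumption: little-endian target.
    stack.append(Value(int.from_bytes(data, "little")))
```

In this shape the operator never sees a location, so the conflict between a location's own address space and the explicit address space operand cannot arise.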

location of the object as implicitly-pushed elements on the stack. The
latter element is now allowed to be any location.

Two new operators, `DW_OP_offset` and `DW_OP_bit_offset`, are introduced
Contributor

Do you want to preemptively answer the question "why not just use DW_OP_plus"?
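
One way to frame the answer is a toy model in which `DW_OP_plus` stays limited to values while `DW_OP_offset` works on any location kind. The sketch below is only an illustration: the `Value` and `Location` classes are hypothetical, the stack order assumed for `DW_OP_offset` (displacement on top, location below) is an assumption, and the implicit conversion between values and default-address-space memory locations is ignored.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Value:
    n: int


@dataclass(frozen=True)
class Location:
    kind: str     # toy storage kind: "memory", "register", ...
    storage: int  # address space or register number
    offset: int   # byte offset into that storage


def op_plus(stack):
    """DW_OP_plus in this toy model: values only."""
    b, a = stack.pop(), stack.pop()
    if not (isinstance(a, Value) and isinstance(b, Value)):
        raise TypeError("DW_OP_plus operates on values only")
    stack.append(Value(a.n + b.n))


def op_offset(stack):
    """Assumed DW_OP_offset shape: pop a displacement, then a location."""
    delta, loc = stack.pop(), stack.pop()
    if not (isinstance(delta, Value) and isinstance(loc, Location)):
        raise TypeError("expected a location below a displacement value")
    stack.append(replace(loc, offset=loc.offset + delta.n))
```

With `[Location("register", 2, 0), Value(1)]` on the stack, `op_offset` yields a register location at offset 1, whereas `op_plus` rejects the same stack because a register location is not a value.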

or a bit offset.

The composite location operators, `DW_OP_piece` and `DW_OP_bit_piece`,
are redefined to build up a composite location, which is held in the top
Contributor

nitpicking: when appending a piece (unless using the implicit behavior to append an undefined piece), the composite location itself is not the topmost element of the stack. Not sure it matters, but I mention it in case you want to find a better formulation.

Owner Author

Looks like a bad edit. I'll fix it.

> object or other entity in memory. On architectures that support
> multiple address spaces, a memory location contains a component that
> identifies the address space (which may be provided by the
> `DW_OP_xderef` operation). A memory location is considered
Contributor

That sounds wrong. I don't think that DW_OP_xderef pushes a memory location, I think it pushes a value. Unless you mean that a memory location conceptually briefly exists during the execution of DW_OP_xderef?

Owner Author

Removed the parenthetical remark.

> The `DW_OP_piece` operation takes a single operand, which is an unsigned
> LEB128 number. The number describes the size `S`, in bytes, of the piece
> of the object referenced by the location `A` on the top of the stack. If
> the piece is located in a register, but does not occupy the entire
Contributor

> If the piece is located in a register, but does not occupy the entire register, the placement of the piece within that register is defined by the ABI.

Yeah, so in our ideal world where we treat all storages uniformly, just as sequences of bits, we wouldn't have that. Perhaps a point to discuss.

In DWARF 5, register location descriptions just specified the whole register, so I guess a consumer could interpret: "when you say reg 2, I magically know that you mean those specific bytes of reg 2".

But now that register locations have an offset component, I'm not sure how a consumer is supposed to interpret these:

DW_OP_composite
DW_OP_reg2
DW_OP_piece 2

vs

DW_OP_composite
DW_OP_reg2
DW_OP_offset 1
DW_OP_piece 2

In the first one, the register location has an offset of 0. So I guess that consumers in that specific case (offset == 0) could, with some hand-waving, decide to apply the same "placement defined by ABI" rules they applied in DWARF 5.

But then, what about the second example? The producer specified an explicit offset, so it would be strange to just ignore it and pick some other bytes.

And then, what if the producer really wants to point at those bytes at offset 0 in register 2? If some magic "placement defined by ABI" rules kick in, it's just not possible.

I recall this presentation from Andreas Arnez; if I remember correctly, it's a concrete example of this problem.

https://youtu.be/iQAd5Atlz1s?list=PL_GiHdX17Wtx2Bu1O_bREetZZv4moIaRi&t=2137

And also these threads:

[Dwarf-Discuss] DWARF piece questions
https://www.mail-archive.com/dwarf-discuss@lists.dwarfstd.org/msg00344.html

[gdb] [RFC] DW_OP_piece vs. DW_OP_bit_piece on a Register
https://inbox.sourceware.org/gdb/m3vb6wm86q.fsf@oc1027705133.ibm.com/

If time permits, I'd like to go through them again to understand what impact our proposal would have for the problems he presents.
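
The ambiguity described in this comment can be sketched with a toy 8-byte register. Both functions below are hypothetical readings, not anything defined by DWARF or the proposal: `piece_uniform` treats the register as a plain byte sequence, and `piece_abi` applies an assumed "placement defined by ABI" hook whenever the offset is 0. The ABI rule used in the example (place the piece in the last bytes of the register) is invented for illustration.

```python
def piece_uniform(reg: bytes, offset: int, size: int) -> bytes:
    """Uniform reading: take bytes [offset, offset + size) of the register."""
    return reg[offset:offset + size]


def piece_abi(reg: bytes, offset: int, size: int, abi_start) -> bytes:
    """DWARF 5-style reading: at offset 0, a hypothetical ABI hook picks the bytes."""
    if offset == 0:
        start = abi_start(len(reg), size)
        return reg[start:start + size]
    return reg[offset:offset + size]


# Toy register contents: byte i holds the value i.
reg2 = bytes(range(8))


def tail(reg_size: int, size: int) -> int:
    """Invented ABI rule: the piece occupies the last `size` bytes."""
    return reg_size - size


first_uniform = piece_uniform(reg2, 0, 2)    # bytes 0..1
first_abi = piece_abi(reg2, 0, 2, tail)      # bytes 6..7
second_uniform = piece_uniform(reg2, 1, 2)   # bytes 1..2
second_abi = piece_abi(reg2, 1, 2, tail)     # bytes 1..2
```

Under the assumed ABI rule, the two readings agree once an explicit nonzero offset is given, but no expression can select bytes 0 and 1 of the register, which is the problem raised above.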

Owner Author

Yes, this is the point I was trying to get at in my "registers really are different" arguments and my Feb. 27 email about "offsets".

Contributor

The issue is that the piece operations define registers as treated differently. So to me it is not that registers are really different, it is that the piece operations were defined to treat them differently.

The proposal redefined the piece operations to treat all storage kinds the same way. That includes registers and implicits. Then registers are not different.

If we do not want to do that then I advocate we should add new piece operations that act like the proposal, and leave the old ones for legacy reasons. The old ones are not very useful when building expressions incrementally as optimizations are applied that change the storage from one kind to another. They also do not compose if using DWARF procedures.

> addition of an undefined piece to the existing composite location.
>
> - Otherwise, if the top of the stack `A` is a location, or convertible
> to a location, and the preceding element is not a composite location,
Contributor

I don't know if you want to be explicit here, that "the preceding element is not a composite location" includes the case where "the preceding element doesn't exist", aka "A is the only element in the stack".

Owner Author

This may need a separate case. I was trying to write it to include the case where A is already the only element on the stack.

Contributor

I am not in favor of the rule that pops an arbitrary number of entries off the stack. I would much prefer to simply define the expression as invalid. There are many other cases where expressions are stated as being invalid, so why not in this case? It makes for a much simpler formal semantic model that is far easier to reason about.

Contributor

While discussing this in the meetings, John Del Signore made a point that keeping backwards compatibility with existing expressions was important, to help producers migrate from DWARF 5 to 6 incrementally. That's why we decided to keep those cases in there.

> one or more elements below `A` are popped and discarded until the
> preceding element `B` is a composite location, or until `A` is the
> only element on the stack. If `A` is the only remaining element, a new
> empty composite is inserted before it (as if `DW_OP_composite
Contributor

> as if DW_OP_composite DW_OP_swap had been processed immediately prior to the piece operation

I understand what you mean, but that might be confusing, because if DW_OP_composite DW_OP_swap had occurred prior to the piece operation, then we wouldn't do all this popping and special case handling.
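
For readers following the thread, the backward-compatibility case quoted above can be sketched as a small Python model. This is one reading of the quoted text under stated assumptions, not the proposal's definition: `Composite` is a hypothetical class, locations are represented by plain strings, `A` is assumed to already be a location, and the piece is assumed to be appended to the composite, which is then left on top of the stack.

```python
from dataclasses import dataclass, field


@dataclass
class Composite:
    """Toy composite location: a list of (location, size_in_bytes) pieces."""
    pieces: list = field(default_factory=list)


def op_piece(stack, size):
    """Sketch of the quoted DW_OP_piece case where A is a location."""
    a = stack.pop()
    # Discard entries below A until a composite is exposed or nothing is left.
    while stack and not isinstance(stack[-1], Composite):
        stack.pop()
    # If A was the only remaining element, start a new empty composite.
    if not stack:
        stack.append(Composite())
    # Assumption: the piece is appended and the composite stays on the stack.
    stack[-1].pieces.append((a, size))
```

Starting from `["junk", "reg2"]`, a call to `op_piece(stack, 2)` discards `"junk"` and leaves a single composite holding the piece `("reg2", 2)`; pushing `"reg3"` and calling `op_piece(stack, 4)` then appends to that same composite without discarding anything.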

> DWARF expression stack before the `DW_AT_use_location` description is
> evaluated. The first value pushed is the value of the pointer to member
> object itself. The second value pushed is the location of the
> entire structure or union instance containing the member whose address
Contributor

whose address -> whose location?

> a value, the value is implicitly treated as a memory address in the
> default address space, and converted to a memory location. If a value
> is expected, but the result is a memory location in the default
> address space, the address is implicitly converted to a value.

Given that values are typed, is it worth specifying here something like "... converted to a value of generic type."?

Comment on lines +217 to +223
> The `DW_OP_deref_size` takes a single 1-byte unsigned integral operand
> that specifies the size `S`, in bytes, of the value to be retrieved. The
> operation behaves like the `DW_OP_deref` operation: it
> pops the top stack entry and treats it as a location. The first `S` bytes
> are retrieved from the location, zero extended to the size of an
> address on the target machine, and pushed onto the stack as a value of
> the generic type.

What if S is bigger than the size of an address? DWARF-5 says "whose value may not be larger than the size of the generic type". This information seems to have been lost.
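
A minimal sketch of the check being asked about, assuming an 8-byte generic type and a little-endian target (both are choices made for this illustration, and `deref_size` is a hypothetical helper rather than anything named in DWARF):

```python
def deref_size(data: bytes, s: int, address_size: int = 8,
               byteorder: str = "little") -> int:
    """Toy DW_OP_deref_size: read the first `s` bytes and zero extend them.

    `data` stands for the bytes found at the location popped from the stack.
    """
    if s > address_size:
        # The DWARF 5 restriction quoted in the comment above.
        raise ValueError("S may not be larger than the size of the generic type")
    # int.from_bytes is unsigned by default, which gives zero extension.
    return int.from_bytes(data[:s], byteorder)
```

With the restriction restored, a request such as `deref_size(data, 9)` is rejected instead of being left undefined.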

Comment on lines +227 to +228
> is a 1-byte unsigned integer that specifies the size `S` of the type
> given by the second operand. The second operand is an unsigned LEB128

DWARF-5 explicitly says that S is the same as the size of type T. Here it is not so clear that they must match.

> multiple address spaces, a memory location contains a component that
> identifies the address space (which may be provided by the
> `DW_OP_xderef` operation). A memory location is considered
> _unbounded_, as the size of the location storage is only implied by

IMHO, "location storage" is a confusing term, because the storage is not meant to store locations. Can we just say "storage"?

Owner Author

Agreed, but I think "storage" is too generic and not obvious we're using a term for something specific. I've thought about other names for this, but the best I've come up with so far is "storage bank". What do you think?


> Agreed, but I think "storage" is too generic and not obvious we're using a term for something specific. I've thought about other names for this, but the best I've come up with so far is "storage bank". What do you think?

I think it's better than "location storage".

> identifies the address space (which may be provided by the
> `DW_OP_xderef` operation). A memory location is considered
> _unbounded_, as the size of the location storage is only implied by
> the type of object stored at that location.

Hmm, isn't the size of the storage (in this case, a memory), bounded by the max value of the address? Or do you mean the location (not the underlying storage) implicitly has a size determined by the type of the object stored there?

Owner Author

I've removed this sentence; it was an attempt to define "unbounded" vs. "bounded", but that concept isn't referenced anywhere else in this proposal. I think it's best left to 004-clarifications-mem.

Comment on lines +334 to +335
Move the contents of Section 2.6.1.1.4 here, replacing the term
"location description" with "location" throughout.

There is one sentence: "DWARF location descriptions are intended to yield the location of a value rather than the value itself." I think it should not be replaced in this instance.

Don't we need to edit the following text for DW_OP_stack_value more:
"In this form of location description, the DWARF expression
represents the actual value of the object, rather than its location."

to something like:
"In this form of location, the value at the top of the DWARF expression
stack represents the actual value of the object, rather than its location."

There is also
"DW_OP_stack_value operation terminates the expression."
Should this limitation stay? Why not just remove it to have more flexibility? DW_OP_stack_value is essentially the same as DW_OP_implicit_value, except that the contents come from the top of the stack instead of the operand. An implicit value does not say that it should terminate the expression.

Comment on lines +344 to +345
> optimization). The `DW_OP_undefined` operation pushes an undefined
> location onto the stack. A DWARF expression containing no operations or

I understand what is meant here by "pushes an undefined location". We are pushing a location of kind "undefined" (similar to "memory", "register", etc. kinds). But I fear that some readers may think an arbitrary location is pushed. I don't have a suggestion to make it better, though. Just noting.
