From 6514c0db97baab60b80b7382c6eabbc60d917b2e Mon Sep 17 00:00:00 2001
From: Joel Torstensson
Date: Thu, 11 Jun 2020 13:50:38 +0200
Subject: [PATCH 1/6] feat: add dag-jose format

---
 README.md                      |  1 +
 block-layer/codecs/dag-jose.md | 43 ++++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)
 create mode 100644 block-layer/codecs/dag-jose.md

diff --git a/README.md b/README.md
index e382571f..17c76f27 100644
--- a/README.md
+++ b/README.md
@@ -41,6 +41,7 @@ IPLD can operate across a broad range of content-addressable codecs, including G
 | [Specification: DAG-CBOR](block-layer/codecs/dag-cbor.md) | [block-layer/codecs/dag-cbor.md](block-layer/codecs/dag-cbor.md) |
 | [Specification: DAG-JSON](block-layer/codecs/dag-json.md) | [block-layer/codecs/dag-json.md](block-layer/codecs/dag-json.md) |
 | [Specification: DAG-PB](block-layer/codecs/dag-pb.md) | [block-layer/codecs/dag-pb.md](block-layer/codecs/dag-pb.md) |
+| [Specification: DAG-JOSE](block-layer/codecs/dag-jose.md) | [block-layer/codecs/dag-jose.md](block-layer/codecs/dag-jose.md) |
 
 ## The IPLD Data Model
 
diff --git a/block-layer/codecs/dag-jose.md b/block-layer/codecs/dag-jose.md
new file mode 100644
index 00000000..2f66349a
--- /dev/null
+++ b/block-layer/codecs/dag-jose.md
@@ -0,0 +1,43 @@
+# Specification: DAG-JOSE
+
+**Status: Descriptive - Draft**
+
+JOSE is a stanard for signing and encrypting JSON objects. The various specifications for JOSE can be found in the [IETF datatracker](https://datatracker.ietf.org/wg/jose/documents/).
+
+DAG JOSE supports the full [IPLD Data Model](../data-model-layer/data-model.md).
+
+## Format
+
+The are two main ways to represent a JOSE node. As a JWS ([json web signature](https://datatracker.ietf.org/doc/rfc7515/?include_text=1)) and JWE ([json web encryption](https://datatracker.ietf.org/doc/rfc7516/?include_text=1)). These two formats acts as the primitives in JOSE and can be used to create JWT and JWM objects.
This specification describes how to encode JWS and JWE as an IPLD format.
+
+### Serialization
+
+Both JWS and JWE supports different serialization formats: `Compact Serialization`, `Flattened JSON Serialization`, and `General JSON Serialization`. The first two are more concise, but they only allow for one recipient. Therefore DAG JOSE always uses the General Serialization format for maximum compatibility and minimum ambiguity.
+
+#### Ordering
+
+Codec implementors **MUST** use the specified order of JOSE properties to ensure hashes consistently match for the same block data. Since JWS and JWE have a strict set of properties this is straight forward.
+
+##### JWS
+
+The top level object has two properties which should have the order: `payload` then `signatures`. The `signatures` property contains an array of signature elements. Within each of these elements there are three properties which should have the order: `protected`, `header`, then `signature`. Important to note here is that `protected` and `header` may be absent.
+
+The content of the `payload`, `signature`, and `protected` properties are `base64url` encoded and therefore does not need any sorting. In contrast, the `header` property contains an unencoded JSON object and should sort object keys by their (UTF-8) encoded representation, i.e. with byte comparisons.
+
+Finally all whitespace should be stripped. This produces the most compact and consistent representation which will ensure that two codecs producing the same data end up with matching block hashes.
+
+##### Deserializing JWS
+
+In it's serialized format a JWS `payload` is encoded using `base64url`. However, this payload may contain IPLD links. Therefore the decoded content of `payload` uses the same approach as [DAG-JSON](./dag-json.md) to support *Bytes Kind* and *Link Kind*. When the DAG-JOSE codec decodes a JWS it should also decode the payload.
+
+##### JWE
+
+With JWE there are a few more properties that needs to be in the correct order: `protected`, `unprotected`, `iv`, `aad`, `ciphertext`, `tag`, then `recipients`. Within the `recipients` array each element should have the property order: `header` then `encrypted_key`. Important to note here is that only the `ciphertext` property is required, all other properties may be absent.
+
+The content of the `protected`, `iv`, `aad`, `ciphertext`, `tag`, and `encrypted_key` properties are `base64url` encoded and therefore does not need any sorting. In contrast, the `unprotected` and `header` property contains unencoded JSON objects and should sort object keys by their (UTF-8) encoded representation, i.e. with byte comparisons.
+
+Finally all whitespace should be stripped. This produces the most compact and consistent representation which will ensure that two codecs producing the same data end up with matching block hashes.
+
+##### Decrypting JWE
+
+Decryption is not directly relevant to the IPLD codec. However, as a useful sidenote it's important to consider that the decrypted message, similar to the decoded JWS payload, may contain *Bytes Kind* and *Link Kind* data. Decrypted data can thus be interpreted as an IPLD dag node.
\ No newline at end of file

From 9cd7771fddf456c7b754bb92d897ae9791a67ed2 Mon Sep 17 00:00:00 2001
From: Joel Torstensson
Date: Fri, 12 Jun 2020 11:05:00 +0200
Subject: [PATCH 2/6] fix: clarify payload property

---
 block-layer/codecs/dag-jose.md | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/block-layer/codecs/dag-jose.md b/block-layer/codecs/dag-jose.md
index 2f66349a..1f480353 100644
--- a/block-layer/codecs/dag-jose.md
+++ b/block-layer/codecs/dag-jose.md
@@ -12,13 +12,15 @@ The are two main ways to represent a JOSE node.
As a JWS ([json web signature](h
 
 ### Serialization
 
-Both JWS and JWE supports different serialization formats: `Compact Serialization`, `Flattened JSON Serialization`, and `General JSON Serialization`. The first two are more concise, but they only allow for one recipient. Therefore DAG JOSE always uses the General Serialization format for maximum compatibility and minimum ambiguity.
+Both JWS and JWE supports different serialization formats: `Compact Serialization`, `Flattened JSON Serialization`, and `General JSON Serialization`. The first two are more concise, but they only allow for one recipient. Therefore DAG JOSE always uses the `Compact Serialization` if there is just one recipient, and the `General JSON Serialization` if there are multiple recipients. This ensures maximum compatibility and compactness with minimum ambiguity.
+
+The implementation of the serialization function should accept all JOSE formats and convert them if necessary.
 
 #### Ordering
 
 Codec implementors **MUST** use the specified order of JOSE properties to ensure hashes consistently match for the same block data. Since JWS and JWE have a strict set of properties this is straight forward.
 
-##### JWS
+#### JWS
 
 The top level object has two properties which should have the order: `payload` then `signatures`. The `signatures` property contains an array of signature elements. Within each of these elements there are three properties which should have the order: `protected`, `header`, then `signature`. Important to note here is that `protected` and `header` may be absent.
 
@@ -26,11 +28,24 @@ The content of the `payload`, `signature`, and `protected` properties are `base6
 
 Finally all whitespace should be stripped. This produces the most compact and consistent representation which will ensure that two codecs producing the same data end up with matching block hashes.
 
-##### Deserializing JWS
+##### JWS payload
+
+In it's serialized format a JWS `payload` is encoded using `base64url`.
The content of the payload can encode any arbitrary data. To distinguish different data formats a `cty` (Content Type) Header Parameter can be defined. However, it's not required in any way, so it's not something that can be relied upon. It's quite common that the content of the `payload` simply contains JSON. With DAG-JOSE the `payload` is extended to also support [DAG-JSON](./dag-json.md) (with *Bytes Kind* and *Link Kind*).
+
+This means that the `payload` can be represented in two different ways:
+
+* `base64url` encoded ([DAG-JSON](./dag-json.md), or arbitrary data)
+* Deserialized DAG-JSON
 
-In it's serialized format a JWS `payload` is encoded using `base64url`. However, this payload may contain IPLD links. Therefore the decoded content of `payload` uses the same approach as [DAG-JSON](./dag-json.md) to support *Bytes Kind* and *Link Kind*. When the DAG-JOSE codec decodes a JWS it should also decode the payload.
+The serialization function should accept both of these input formats and convert them if necessary.
 
-##### JWE
+Note that the JWS signature happens over `ASCII(BASE64URL(UTF8(JWS Protected Header)) || '.' || BASE64URL(JWS Payload))` according to the [JWS specification](https://datatracker.ietf.org/doc/rfc7515/?include_text=1), so if the `payload` contains JSON it need to be ordered in a determinitic way for the signature to always be correct. The DAG-JOSE format should not prefer any specific ordering as different JWS implementations might have different preferences. If [DAG-JSON](./dag-json.md) is used this is however completely mitigated since it uses strict ordering.
+
+##### Deserializing the payload
+
+When the JWS is deserialized the `payload` should also be decoded using [DAG-JSON](./dag-json.md) if possible. If [DAG-JSON](./dag-json.md) is not detected, the `payload` should not be decoded. By decoding the payload, standard IPLD tools can be used to traverse the content and potential links within the signed data.
+
+#### JWE
 
 With JWE there are a few more properties that needs to be in the correct order: `protected`, `unprotected`, `iv`, `aad`, `ciphertext`, `tag`, then `recipients`. Within the `recipients` array each element should have the property order: `header` then `encrypted_key`. Important to note here is that only the `ciphertext` property is required, all other properties may be absent.
 
@@ -38,6 +53,7 @@ The content of the `protected`, `iv`, `aad`, `ciphertext`, `tag`, and `encrypted
 
 Finally all whitespace should be stripped. This produces the most compact and consistent representation which will ensure that two codecs producing the same data end up with matching block hashes.
 
-##### Decrypting JWE
+##### Decrypting the JWE
+
+Similar to the `payload` of JWS, the decrypted data of a JWE may be encoded as [DAG-JSON](./dag-json.md). The implementation of the decryption function should account for this if neccessary to allow the data be interpreted as an IPLD dag node.
 
-Decryption is not directly relevant to the IPLD codec. However, as a useful sidenote it's important to consider that the decrypted message, similar to the decoded JWS payload, may contain *Bytes Kind* and *Link Kind* data. Decrypted data can thus be interpreted as an IPLD dag node.
\ No newline at end of file

From 28edd8d0dd6540552bac8df54f6537e1a1a7c295 Mon Sep 17 00:00:00 2001
From: Joel Torstensson
Date: Tue, 30 Jun 2020 10:26:19 +0200
Subject: [PATCH 3/6] fix(dag-jose): restucture and clarify content

---
 block-layer/codecs/dag-jose.md | 117 ++++++++++++++++++++++++++-------
 1 file changed, 92 insertions(+), 25 deletions(-)

diff --git a/block-layer/codecs/dag-jose.md b/block-layer/codecs/dag-jose.md
index 1f480353..cedc1a0e 100644
--- a/block-layer/codecs/dag-jose.md
+++ b/block-layer/codecs/dag-jose.md
@@ -4,56 +4,123 @@
 
 JOSE is a stanard for signing and encrypting JSON objects.
The various specifications for JOSE can be found in the [IETF datatracker](https://datatracker.ietf.org/wg/jose/documents/).
 
-DAG JOSE supports the full [IPLD Data Model](../data-model-layer/data-model.md).
+DAG JOSE supports the full [IPLD Data Model](../data-model-layer/data-model.md) (within the payload).
 
 ## Format
 
-The are two main ways to represent a JOSE node. As a JWS ([json web signature](https://datatracker.ietf.org/doc/rfc7515/?include_text=1)) and JWE ([json web encryption](https://datatracker.ietf.org/doc/rfc7516/?include_text=1)). These two formats acts as the primitives in JOSE and can be used to create JWT and JWM objects. This specification describes how to encode JWS and JWE as an IPLD format.
+The are two main ways to represent a JOSE object. As a JWS ([json web signature](https://datatracker.ietf.org/doc/rfc7515/?include_text=1)) and JWE ([json web encryption](https://datatracker.ietf.org/doc/rfc7516/?include_text=1)). These two formats acts as the primitives in JOSE and can be used to create JWT and JWM objects etc. This specification describes how to encode JWS and JWE as an IPLD format.
+
+### Representation
+
+The layout of a decoded JOSE object is described by the IPLD schema defined below. We will refer to this layout as the `Decoded Representation`.
+
+```ipldsch
+type Signature struct {
+  header optional {String:Any}
+  protected optional {String:Any}
+  signature Bytes
+}
+
+type JWS struct {
+  payload Any
+  signatures [Signature]
+}
+
+type Recipient struct {
+  encrypted_key optional Bytes
+  header optional {String:Any}
+}
+
+type JWE struct {
+  aad optional Bytes
+  ciphertext Bytes
+  iv optional Bytes
+  protected optional {String:Any}
+  recipients [Recipient]
+  tag optional Bytes
+  unprotected optional {String:Any}
+}
+
+type JOSE union {
+  | JWS jws
+  | JWE jwe
+} representation kinded
+```
 
 ### Serialization
 
-Both JWS and JWE supports different serialization formats: `Compact Serialization`, `Flattened JSON Serialization`, and `General JSON Serialization`. The first two are more concise, but they only allow for one recipient. Therefore DAG JOSE always uses the `Compact Serialization` if there is just one recipient, and the `General JSON Serialization` if there are multiple recipients. This ensures maximum compatibility and compactness with minimum ambiguity.
+Both JWS and JWE supports three different serialization formats: `Compact Serialization`, `Flattened JSON Serialization`, and `General JSON Serialization`. The first two are more concise, but they only allow for one recipient. Therefore DAG JOSE always uses the `General Serialization` which ensures maximum compatibility with minimum ambiguity.
 
-The implementation of the serialization function should accept all JOSE formats and convert them if necessary.
+The implementation of the serialization function should accept all JOSE formats including the `Decoded Representation` and convert them if necessary.
 
-#### Ordering
+#### General JSON Serialization
+
+Below the `General JSON Serialization` can be observed. Note that all data represented as `String` here is data that has been encoded using `base64url`. Converting `Compact Serialization` and `Flattened JSON Serialization` to the general serialization is trivial.
+
+```ipldsch
+type GeneralSignature struct {
+  header optional {String:Any}
+  protected optional String
+  signature String
+}
 
-Codec implementors **MUST** use the specified order of JOSE properties to ensure hashes consistently match for the same block data. Since JWS and JWE have a strict set of properties this is straight forward.
+type GeneralJWS struct {
+  payload String
+  signatures [GeneralSignature]
+}
 
-#### JWS
+type GeneralRecipient struct {
+  encrypted_key optional String
+  header optional {String:Any}
+}
 
-The top level object has two properties which should have the order: `payload` then `signatures`. The `signatures` property contains an array of signature elements. Within each of these elements there are three properties which should have the order: `protected`, `header`, then `signature`. Important to note here is that `protected` and `header` may be absent.
+type GeneralJWE struct {
+  aad optional String
+  ciphertext String
+  iv optional String
+  protected optional String
+  recipients [GeneralRecipient]
+  tag optional String
+  unprotected optional {String:Any}
+}
 
-The content of the `payload`, `signature`, and `protected` properties are `base64url` encoded and therefore does not need any sorting. In contrast, the `header` property contains an unencoded JSON object and should sort object keys by their (UTF-8) encoded representation, i.e. with byte comparisons.
+type GeneralJOSE union {
+  | GeneralJWS jws
+  | GeneralJWE jwe
+} representation kinded
+```
 
-Finally all whitespace should be stripped. This produces the most compact and consistent representation which will ensure that two codecs producing the same data end up with matching block hashes.
+##### Serializing the Decoded Representation
 
-##### JWS payload
+When serializing a JOSE object from the `Decoded Representation` special care needs to be taken with the `payload` property as well as the `protected` properties.
 
-In it's serialized format a JWS `payload` is encoded using `base64url`.
The content of the payload can encode any arbitrary data. To distinguish different data formats a `cty` (Content Type) Header Parameter can be defined. However, it's not required in any way, so it's not something that can be relied upon. It's quite common that the content of the `payload` simply contains JSON. With DAG-JOSE the `payload` is extended to also support [DAG-JSON](./dag-json.md) (with *Bytes Kind* and *Link Kind*).
+###### Protected
 
-This means that the `payload` can be represented in two different ways:
+The `protected` property in JWE and JWS have the type `{String:Any}`. This means that it may include data with *Link Kind* and *Bytes Kind*. These should be converted into pure JSON in the same way as it's done in [DAG-JSON](./dag-json.md). However, the properties should **not** be sorted since that would cause any integrity check on the JOSE data to fail. Once in JSON format the `protected` property should be converted into `base64url` using the method described in the JOSE spec (`BASE64URL(UTF8(data))`).
 
-* `base64url` encoded ([DAG-JSON](./dag-json.md), or arbitrary data)
-* Deserialized DAG-JSON
+###### Payload
 
-The serialization function should accept both of these input formats and convert them if necessary.
+The payload property of JWS can be of either `Bytes` or `{String:Any}` types. If the former it's simply just encoded as `base64url`. If the latter, it should be encoded in the same manner as the `protected` property.
 
-Note that the JWS signature happens over `ASCII(BASE64URL(UTF8(JWS Protected Header)) || '.' || BASE64URL(JWS Payload))` according to the [JWS specification](https://datatracker.ietf.org/doc/rfc7515/?include_text=1), so if the `payload` contains JSON it need to be ordered in a determinitic way for the signature to always be correct. The DAG-JOSE format should not prefer any specific ordering as different JWS implementations might have different preferences.
If [DAG-JSON](./dag-json.md) is used this is however completely mitigated since it uses strict ordering.
+Note that any change in the ordering of the properties of the payload at this point would cause potential validation of the JOSE object to fail. Good signature libraries will sort the payload before the signature is applied.
+
+#### Ordering
 
-##### Deserializing the payload
+Once the data has been converted to the `General Serialization`, codec implementors **MUST** use the same sorting algorithm as [DAG-JSON](./dag-json.md) to sort the data to ensure hashes consistently match for the same block data.
 
-When the JWS is deserialized the `payload` should also be decoded using [DAG-JSON](./dag-json.md) if possible. If [DAG-JSON](./dag-json.md) is not detected, the `payload` should not be decoded. By decoding the payload, standard IPLD tools can be used to traverse the content and potential links within the signed data.
+## Additional information
 
-#### JWE
+### Reccomended JOSE creation strategy
 
-With JWE there are a few more properties that needs to be in the correct order: `protected`, `unprotected`, `iv`, `aad`, `ciphertext`, `tag`, then `recipients`. Within the `recipients` array each element should have the property order: `header` then `encrypted_key`. Important to note here is that only the `ciphertext` property is required, all other properties may be absent.
+When creating a JOSE object there are some suggested approaches of how to format the data that is being signed / encrypted / authenticated that will keep you out of trouble. The main thing to keep in mind is that signatures / data authentication could be invalidated if the order of the properties in the JOSE object changes. It's therefore a good idea to sort the properties before any signature / authentication is added. The best way to do this is simply to use the same strategy employed by [DAG-JSON](./dag-json.md), which will also convert `Link` and `Bytes` to JSON representation.
+For JWS the relevant properties to do this for is `protected` and `payload` since the signature is done over `ASCII(BASE64URL(UTF8(JWS Protected Header)) || '.' || BASE64URL(JWS Payload))` according to the [JWS specification](https://datatracker.ietf.org/doc/rfc7515/?include_text=1).
 
-The content of the `protected`, `iv`, `aad`, `ciphertext`, `tag`, and `encrypted_key` properties are `base64url` encoded and therefore does not need any sorting. In contrast, the `unprotected` and `header` property contains unencoded JSON objects and should sort object keys by their (UTF-8) encoded representation, i.e. with byte comparisons.
+For JWE it is `protected` and the cleartext before it is encrypted into `ciphertext`.
 
-Finally all whitespace should be stripped. This produces the most compact and consistent representation which will ensure that two codecs producing the same data end up with matching block hashes.
+### Decryption of JWEs
 
-##### Decrypting the JWE
+Similar to the `payload` of JWS, the decrypted data of a JWE may be encoded as [DAG-JSON](./dag-json.md) as described above. The implementation of the decryption function should account for this if neccessary to allow the data be interpreted as an IPLD dag node. In the future the decryption itself could be described using an [Advanced IPLD schema layout](../../schemas/advanced-layouts.md).
 
-Similar to the `payload` of JWS, the decrypted data of a JWE may be encoded as [DAG-JSON](./dag-json.md). The implementation of the decryption function should account for this if neccessary to allow the data be interpreted as an IPLD dag node.
+
+### Implementations
 
+* [Javascript](https://github.com/oed/js-dag-jose)
\ No newline at end of file

From ef94dc57e4032d63af709022c890352088c08686 Mon Sep 17 00:00:00 2001
From: Alex Good
Date: Wed, 30 Sep 2020 14:48:41 +0100
Subject: [PATCH 4/6] fix(dag-jose): update to match implementations

- Specify that `payload` of JWS and plaintext of JWE ciphertext must be CIDs
- Use `Bytes` instead of `String` {String:Any} for authenticated headers
- Add clarifying prose around purpose of the encoded and decoded schemas
- Add clarifying prose for encrypted padding
- Add Go implementation
---
 block-layer/codecs/dag-jose.md | 118 +++++++++++++--------------------
 1 file changed, 46 insertions(+), 72 deletions(-)

diff --git a/block-layer/codecs/dag-jose.md b/block-layer/codecs/dag-jose.md
index cedc1a0e..1a449e7d 100644
--- a/block-layer/codecs/dag-jose.md
+++ b/block-layer/codecs/dag-jose.md
@@ -2,125 +2,99 @@
 
 **Status: Descriptive - Draft**
 
-JOSE is a stanard for signing and encrypting JSON objects. The various specifications for JOSE can be found in the [IETF datatracker](https://datatracker.ietf.org/wg/jose/documents/).
-
-DAG JOSE supports the full [IPLD Data Model](../data-model-layer/data-model.md) (within the payload).
+JOSE is a standard for signing and encrypting JSON objects. The various specifications for JOSE can be found in the [IETF datatracker](https://datatracker.ietf.org/wg/jose/documents/).
 
 ## Format
 
-The are two main ways to represent a JOSE object. As a JWS ([json web signature](https://datatracker.ietf.org/doc/rfc7515/?include_text=1)) and JWE ([json web encryption](https://datatracker.ietf.org/doc/rfc7516/?include_text=1)). These two formats acts as the primitives in JOSE and can be used to create JWT and JWM objects etc. This specification describes how to encode JWS and JWE as an IPLD format.
+The are two kinds of JOSE objects: JWS ([JSON web signature](https://datatracker.ietf.org/doc/rfc7515/?include_text=1)) and JWE ([JSON web encryption](https://datatracker.ietf.org/doc/rfc7516/?include_text=1)). These two objects are primitives in JOSE and can be used to create JWT and JWM objects etc. The IETF RFCs specify a JSON encoding of JOSE objects. This specification maps the JSON encoding to CBOR. Upon encountering the `dag-jose` multiformat implementations can be sure that the block contains dag-cbor encoded data which matches the IPLD schema we specify below.
+
+### Mapping from the JOSE general JSON serialization to dag-jose serialization
+
+Both JWS and JWE supports three different serialization formats: `Compact Serialization`, `Flattened JSON Serialization`, and `General JSON Serialization`. The first two are more concise, but they only allow for one recipient. Therefore DAG JOSE always uses the `General Serialization` which ensures maximum compatibility with minimum ambiguity. Libraries implementing serialization should accept all JOSE formats including the `Decoded Representation` (see below) and convert them if necessary.
+
+To map the general JSON serialization to CBOR we do the following:
 
-### Representation
+- Any field which is represented as `base64url()` we map directly to `Bytes` . For fields like `header` and `protected` which are specified as the `base64url(ascii())` that means that the value is the `ascii()` bytes.
+- For JWS we specify that the `payload` property MUST be a CID, and we set the `payload` of the encoded JOSE object to `Bytes` containing the bytes of the CID. For applications where an additional network request to retrieve the linked content is undesirable then an `identity` multihash should be used.
+- For JWE objects the `ciphertext` must decrypt to a plaintext which is the bytes of a CID.
This is for the same reason as the `payload` being a CID, and the same approach of using an `identity` multihash can be used, and most likely will be the only way to retain the confidentiality of data.
 
-The layout of a decoded JOSE object is described by the IPLD schema defined below. We will refer to this layout as the `Decoded Representation`.
+Below we present an IPLD schema representing the encoded JOSE objects. Note that the `EncodedJOSE` union is not in fact a valid IPLD schema as there is no valid discriminator. The actual wire format is a single struct which contains all the keys from both the `EncodedJWE` and the `EncodedJWS` structs, implementors should follow [section 9 of the JWE spec](https://tools.ietf.org/html/rfc7516#section-9) and distinguish between these two branches of the union by checking if the `payload` attribute exists, and hence you have a JWS; or the `ciphertext` attribute, hence you have a JWE.
+
+**Encoded JOSE**
 
 ```ipldsch
-type Signature struct {
+type EncodedSignature struct {
   header optional {String:Any}
-  protected optional {String:Any}
+  protected optional Bytes
   signature Bytes
 }
 
-type JWS struct {
-  payload Any
-  signatures [Signature]
-}
-
-type Recipient struct {
+type EncodedRecipient struct {
   encrypted_key optional Bytes
   header optional {String:Any}
 }
 
-type JWE struct {
+type EncodedJWE struct {
   aad optional Bytes
   ciphertext Bytes
   iv optional Bytes
-  protected optional {String:Any}
-  recipients [Recipient]
+  protected optional Bytes
+  recipients [EncodedRecipient]
   tag optional Bytes
   unprotected optional {String:Any}
 }
 
-type JOSE union {
-  | JWS jws
-  | JWE jwe
-} representation kinded
+type EncodedJWS struct {
+  signatures [EncodedSignature]
+  payload optional Bytes
+}
+
+type EncodedJOSE union { EncodedJWE | EncodedJWS }
 ```
 
-### Serialization
+## Padding for encryption
+
+Applications may need to pad the plaintext when encrypting to avoid leaking the size of the plaintext.
This raises the question of how the application knows what part of the decrypted plaintext is padding. In this case we use the fact that the plaintext MUST be a valid CID, implementations should parse the plaintext as a CID and discard any content beyond the multihash digest size - which we assume to be the padding.
 
-Both JWS and JWE supports three different serialization formats: `Compact Serialization`, `Flattened JSON Serialization`, and `General JSON Serialization`. The first two are more concise, but they only allow for one recipient. Therefore DAG JOSE always uses the `General Serialization` which ensures maximum compatibility with minimum ambiguity.
 
-The implementation of the serialization function should accept all JOSE formats including the `Decoded Representation` and convert them if necessary.
+## Decoded JOSE
 
-#### General JSON Serialization
+Typically implementations will want to decode this format into something more useful for applications. Exactly what that will look like depends on the language of the implementation, here we use the IPLD schema language to give a somewhat language agnostic description of what the decoded representation might look like at runtime. Note that everything which is specified as `base64url(ascii())` in the JOSE specs - and which we encode as `Bytes` in the wire format - is here decoded to a `String`. We also add the `link: &Any` attribute to the `DecodedJWS`, which allows applications to easily retrieve the authenticated content.
 
-Below the `General JSON Serialization` can be observed. Note that all data represented as `String` here is data that has been encoded using `base64url`. Converting `Compact Serialization` and `Flattened JSON Serialization` to the general serialization is trivial.
+Also note that - as with the encoded representation - the `DecodedJOSE` union is not valid IPLD schema as there is no way to discriminate between them.
How exactly this would be represented will depend on the language of the implementation. For example in Typescript this would be a straightforward `type DecodedJOSE = DecodedJWE | DecodedJWS`.
 
 ```ipldsch
-type GeneralSignature struct {
+type DecodedSignature struct {
   header optional {String:Any}
   protected optional String
   signature String
 }
 
-type GeneralJWS struct {
+type DecodedJWS struct {
   payload String
-  signatures [GeneralSignature]
+  signatures [DecodedSignature]
+  link: &Any
 }
 
-type GeneralRecipient struct {
+type DecodedRecipient struct {
   encrypted_key optional String
   header optional {String:Any}
 }
 
-type GeneralJWE struct {
+type DecodedJWE struct {
   aad optional String
   ciphertext String
-  iv optional String
-  protected optional String
-  recipients [GeneralRecipient]
-  tag optional String
+  iv String
+  protected String
+  recipients [DecodedRecipient]
+  tag String
   unprotected optional {String:Any}
 }
 
-type GeneralJOSE union {
-  | GeneralJWS jws
-  | GeneralJWE jwe
-} representation kinded
+type DecodedJOSE union { DecodedJWE | DecodedJWS }
 ```
 
-##### Serializing the Decoded Representation
-
-When serializing a JOSE object from the `Decoded Representation` special care needs to be taken with the `payload` property as well as the `protected` properties.
-
-###### Protected
-
-The `protected` property in JWE and JWS have the type `{String:Any}`. This means that it may include data with *Link Kind* and *Bytes Kind*. These should be converted into pure JSON in the same way as it's done in [DAG-JSON](./dag-json.md). However, the properties should **not** be sorted since that would cause any integrity check on the JOSE data to fail. Once in JSON format the `protected` property should be converted into `base64url` using the method described in the JOSE spec (`BASE64URL(UTF8(data))`).
-
-###### Payload
-
-The payload property of JWS can be of either `Bytes` or `{String:Any}` types. If the former it's simply just encoded as `base64url`.
If the latter, it should be encoded in the same manner as the `protected` property.
-
-Note that any change in the ordering of the properties of the payload at this point would cause potential validation of the JOSE object to fail. Good signature libraries will sort the payload before the signature is applied.
-
-#### Ordering
-
-Once the data has been converted to the `General Serialization`, codec implementors **MUST** use the same sorting algorithm as [DAG-JSON](./dag-json.md) to sort the data to ensure hashes consistently match for the same block data.
-
-## Additional information
-
-### Reccomended JOSE creation strategy
-
-When creating a JOSE object there are some suggested approaches of how to format the data that is being signed / encrypted / authenticated that will keep you out of trouble. The main thing to keep in mind is that signatures / data authentication could be invalidated if the order of the properties in the JOSE object changes. It's therefore a good idea to sort the properties before any signature / authentication is added. The best way to do this is simply to use the same strategy employed by [DAG-JSON](./dag-json.md), which will also convert `Link` and `Bytes` to JSON representation.
-For JWS the relevant properties to do this for is `protected` and `payload` since the signature is done over `ASCII(BASE64URL(UTF8(JWS Protected Header)) || '.' || BASE64URL(JWS Payload))` according to the [JWS specification](https://datatracker.ietf.org/doc/rfc7515/?include_text=1).
-
-For JWE it is `protected` and the cleartext before it is encrypted into `ciphertext`.
-
-### Decryption of JWEs
-
-Similar to the `payload` of JWS, the decrypted data of a JWE may be encoded as [DAG-JSON](./dag-json.md) as described above. The implementation of the decryption function should account for this if neccessary to allow the data be interpreted as an IPLD dag node.
In the future the decryption itself could be described using an [Advanced IPLD schema layout](../../schemas/advanced-layouts.md). - -### Implementations +##Implementations -* [Javascript](https://github.com/oed/js-dag-jose) \ No newline at end of file +- [Javascript](https://github.com/oed/js-dag-jose) +- [Go](https://github.com/alexjg/go-dag-jose) From 575840aae53f9dffef6a25226ca41d2fbea95ad3 Mon Sep 17 00:00:00 2001 From: Joel Thorstensson Date: Thu, 12 Nov 2020 12:09:22 +0100 Subject: [PATCH 5/6] Update block-layer/codecs/dag-jose.md --- block-layer/codecs/dag-jose.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block-layer/codecs/dag-jose.md b/block-layer/codecs/dag-jose.md index 1a449e7d..35f6d380 100644 --- a/block-layer/codecs/dag-jose.md +++ b/block-layer/codecs/dag-jose.md @@ -94,7 +94,7 @@ type DecodedJWE struct { type DecodedJOSE union { DecodedJWE | DecodedJWS } ``` -##Implementations +## Implementations - [Javascript](https://github.com/oed/js-dag-jose) - [Go](https://github.com/alexjg/go-dag-jose) From 81be217ce66f9f371a7d2b997c1aa2592bfcb419 Mon Sep 17 00:00:00 2001 From: Joel Thorstensson Date: Thu, 26 Nov 2020 20:32:50 +0100 Subject: [PATCH 6/6] fix: remove Unions --- block-layer/codecs/dag-jose.md | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/block-layer/codecs/dag-jose.md b/block-layer/codecs/dag-jose.md index 35f6d380..9dbdc166 100644 --- a/block-layer/codecs/dag-jose.md +++ b/block-layer/codecs/dag-jose.md @@ -16,9 +16,9 @@ To map the general JSON serialization to CBOR we do the following: - Any field which is represented as `base64url()` we map directly to `Bytes` . For fields like `header` and `protected` which are specified as the `base64url(ascii())` that means that the value is the `ascii()` bytes. - For JWS we specify that the `payload` property MUST be a CID, and we set the `payload` of the encoded JOSE object to `Bytes` containing the bytes of the CID. 
For applications where an additional network request to retrieve the linked content is undesirable then an `identity` multihash should be used. -- For JWE objects the `ciphertext` must decrypt to a plaintext which is the bytes of a CID. This is for the same reason as the `payload` being a CID, and the same approach of using an `identity` multihash can be used, and most likely will be the only way to retain the confidentiality of data. +- For JWE objects the `ciphertext` must decrypt to a cleartext which is the bytes of a CID. This is for the same reason as the `payload` being a CID. The same approach of using an `identity` multihash can be used, and will most likely be the only way to retain the confidentiality of the data. -Below we present an IPLD schema representing the encoded JOSE objects. Note that the `EncodedJOSE` union is not in fact a valid IPLD schema as there is no valid discriminator. The actual wire format is a single struct which contains all the keys from both the `EncodedJWE` and the `EncodedJWS` structs, implementors should follow [section 9 of the JWE spec](https://tools.ietf.org/html/rfc7516#section-9) and distinguish between these two branches of the union by checking if the `payload` attribute exists, and hence you have a JWS; or the `ciphertext` attribute, hence you have a JWE. +Below we present an IPLD schema representing the encoded JOSE objects. Note that there are two IPLD schemas, `EncodedJWE` and `EncodedJWS`. The actual wire format is a single struct which contains all the keys from both the `EncodedJWE` and the `EncodedJWS` structs. Implementors should follow [section 9 of the JWE spec](https://tools.ietf.org/html/rfc7516#section-9) and distinguish between these two branches by checking which attribute exists: `payload` indicates a JWS, `ciphertext` indicates a JWE.
**Encoded JOSE** @@ -45,23 +45,21 @@ type EncodedJWE struct { } type EncodedJWS struct { - signatures [EncodedSignature] payload optional Bytes + signatures [EncodedSignature] } - -type EncodedJOSE union { EncodedJWE | EncodedJWS } ``` ## Padding for encryption -Applications may need to pad the plaintext when encrypting to avoid leaking the size of the plaintext. This raises the question of how the application knows what part of the decrypted plaintext is padding. In this case we use the fact that the plaintext MUST be a valid CID, implementations should parse the plaintext as a CID and discard any content beyond the multihash digest size - which we assume to be the padding. +Applications may need to pad the cleartext when encrypting to avoid leaking the size of the cleartext. This raises the question of how the application knows what part of the decrypted cleartext is padding. In this case we use the fact that the cleartext MUST be a valid CID. Implementations should parse the cleartext as a CID and discard any content beyond the end of the multihash digest, which we assume to be the padding. ## Decoded JOSE Typically implementations will want to decode this format into something more useful for applications. Exactly what that will look like depends on the language of the implementation, here we use the IPLD schema language to give a somewhat language agnostic description of what the decoded representation might look like at runtime. Note that everything which is specified as `base64url(ascii())` in the JOSE specs - and which we encode as `Bytes` in the wire format - is here decoded to a `String`. We also add the `link: &Any` attribute to the `DecodedJWS`, which allows applications to easily retrieve the authenticated content. -Also note that - as with the encoded representation - the `DecodedJOSE` union is not valid IPLD schema as there is no way to discriminate between them. How exactly this would be represented will depend on the language of the implementation.
For example in Typescript this would be a straightforward `type DecodedJOSE = DecodedJWE | DecodedJWS`. +Also note that, as with the encoded representation, there are two different representations: `DecodedJWE` and `DecodedJWS`. Applications can distinguish between these two branches in the same way as with the encoded representation described above. ```ipldsch type DecodedSignature struct { @@ -90,8 +88,6 @@ type DecodedJWE struct { tag String unprotected optional {String:Any} } - -type DecodedJOSE union { DecodedJWE | DecodedJWS } ``` ## Implementations
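
Review note on the "Padding for encryption" text in the last patch: the rule "parse the cleartext as a CID and discard anything beyond the multihash digest" could be sketched roughly as below. This is only an illustration under stated assumptions, not code from js-dag-jose or go-dag-jose. The helper names `readVarint`, `cidLength` and `stripPadding` are hypothetical, and the sketch assumes the usual binary CID layout (a CIDv0 is a bare 34-byte sha2-256 multihash; a CIDv1 is varint version, varint codec, then a multihash made of varint hash code, varint digest size and the digest).

```typescript
// Hypothetical helpers; names and structure are assumptions for illustration.

// Read one unsigned varint (LEB128) starting at `offset`.
// Returns the decoded value and the offset of the next unread byte.
function readVarint(bytes: Uint8Array, offset: number): { value: number; next: number } {
  let value = 0;
  let shift = 0;
  for (let i = offset; i < bytes.length; i++) {
    const byte = bytes[i];
    value += (byte & 0x7f) * 2 ** shift;
    if ((byte & 0x80) === 0) {
      return { value, next: i + 1 };
    }
    shift += 7;
    if (shift > 49) break; // keep the value inside the safe integer range
  }
  throw new Error("invalid or truncated varint");
}

// Return the number of bytes occupied by the CID at the start of `bytes`.
function cidLength(bytes: Uint8Array): number {
  // CIDv0: 0x12 (sha2-256) 0x20 (32-byte digest) followed by the digest.
  if (bytes.length >= 2 && bytes[0] === 0x12 && bytes[1] === 0x20) {
    if (bytes.length < 34) throw new Error("truncated CIDv0");
    return 34;
  }
  const version = readVarint(bytes, 0);
  if (version.value !== 1) throw new Error("unsupported CID version");
  const codec = readVarint(bytes, version.next); // content codec, value unused here
  const hashCode = readVarint(bytes, codec.next); // multihash function code, unused here
  const digestSize = readVarint(bytes, hashCode.next);
  const end = digestSize.next + digestSize.value;
  if (end > bytes.length) throw new Error("truncated multihash digest");
  return end;
}

// Drop everything after the CID; the remainder is assumed to be padding.
function stripPadding(cleartext: Uint8Array): Uint8Array {
  return cleartext.subarray(0, cidLength(cleartext));
}
```

Because the length is derived from the CID itself, this approach does not depend on the padding bytes having any particular value, which matches the wording of the spec text. A real implementation would more likely delegate the parsing to an existing CID library.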