Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-5311] Allow JSONDecoder to somehow provide loose JSON structure #47886

Open
swift-ci opened this issue Jun 26, 2017 · 31 comments
Open

[SR-5311] Allow JSONDecoder to somehow provide loose JSON structure #47886

swift-ci opened this issue Jun 26, 2017 · 31 comments
Assignees
Labels
compiler The Swift compiler in itself improvement standard library Area: Standard library umbrella

Comments

@swift-ci
Copy link
Collaborator

Previous ID SR-5311
Radar rdar://problem/32983213
Original Reporter hannesoid (JIRA User)
Type Improvement
Additional Detail from JIRA
Votes 3
Component/s Compiler, Standard Library
Labels Improvement
Assignee @itaiferber
Priority Medium

md5: 5eb72b813df2f976700cdaab0d43900e

Issue Description:

I have an API, which is mostly strongly typed, but has some part that has a JSON dictionary, with arbitrary elements, which could be used for debugging purposes. At first I tried to map this to my Decodable type via a property of Dictionary type String: Decodable but this didn't work, because swift needs more type info. I then came up with the attached enum for representing a random JSON structure, sort of similar to what NSJSONSerialization provides (see below).

I assume the JSONDecoder has a representation of this structure already.
Could something like this be obtained from the JSONDecoder somehow?
Suggestion would be let Decoder have a typealias CanonicalStructure = JSONValue and a the singleValueContainer have a decoder.decodeCanonicalStructure() throws ?

Here's my quick idea for such a structure https://gist.github.com/hannesoid/10a35895e4dc5d6f1bb6428f7d4d23a5

@belkadan
Copy link
Contributor

cc @itaiferber

@itaiferber
Copy link
Contributor

We've had multiple requests for this type of unevaluated JSON, both on encode and on decode. Definitely on our plans, but at this point it's a bit of an additive feature.

@itaiferber
Copy link
Contributor

@swift-ci create

@swift-ci
Copy link
Collaborator Author

Comment by Hannes Oud (JIRA)

Cool, thanks for the feedback!

@lilyball
Copy link
Mannequin

lilyball mannequin commented Feb 14, 2018

This seems like a huge oversight and it's causing real problems for me right now. I have to accurately describe the structure of a complicated sub-object simply because it's impossible to decode it as a Dictionary, even though the only thing I do with it is re-encode it as JSON a bit later and pass it to another service. And this is a problem because the format of the dictionary is actually a bit in flux right now, so I can't even write the code yet as it is likely to be wrong.

There's also no way to implement this in Decoder right now, because there's no way to ask Decoder if it even knows what the "native" type for a key is.

@itaiferber
Copy link
Contributor

Eridius (JIRA User) We're still interested in offering this, but haven't had the bandwidth to design and implement what the API might look like. See also some of the discussion here. This is possible to work around in the meantime by creating a JSON enum or similar which can represent the different JSON types, and then decode as [String : JSON]; that being said, we'd definitely like to improve the ergonomics of this.

As for asking a Decoder what the "native" type for a key is... Well, there isn't always a good (or right) answer. 1 in JSON, for instance, can be mapped to a Swift Int8, UInt8, Int16, ..., UInt16, Float, Double, or Float80, among others. The most accurate "native" type is debatable.

@lilyball
Copy link
Mannequin

lilyball mannequin commented Feb 14, 2018

How do I decode a JSON enum? How would I implement init(from:) on the enum? There's no way to ask the decoder "what's the native type here". The only way I can see to do this is to try to decode each possible JSON type in turn, catching the error and throwing it away, but a) that's pretty awful, and b) does Decoder even guarantee that failing to decode, throwing away the error, and trying again is actually supported? Plus if the decoder does any sort of type coercion (e.g. decoding an integer as a string) then you'd end up with the wrong result.

@itaiferber
Copy link
Contributor

Eridius (JIRA User) Yes, the model for attempting to decode one of multiple disparate types is to try them in turn until one succeeds. Decoder guarantees it, and JSONDecoder doesn't currently perform any type coercion at all — you get exactly what you ask for. It's up to you to ask for types in the order that you would prefer them, leaving the "is this a Int8 or Int16 or ... or Double" decision up to you; if you care only about Double or Int, then there's no real question.

@itaiferber
Copy link
Contributor

To be clear, asking a given Decoder for "what's in here?" might not always be possible. We can add a JSONDecoder.UnevaluatedJSON type because JSONDecoder is built on top of JSONSerialization and can return the underlying data returned from that. (This, BTW, solves the "there's a number here" problem by returning an NSNumber.) But this won't necessarily be possible for MyCustomFormatDecoder, whose format might not have any sort of reasonable mapping by default.

@lilyball
Copy link
Mannequin

lilyball mannequin commented Feb 14, 2018

There's 2 basic types of decoders: self-describing ones (like JSON) and non-described one (where the interpretation of the input depends on the type you ask for). Self-describing ones should be able to declare their type, the others obviously can't.

One possible solution here is to have another protocol TypedDecoder: Decoder or somesuch that represents decoders for self-describing formats. A typed decoder would also override the 3 container types to be typed as well, and the typed containers would offer a way to query the type.

Then we could have TypedDecodable for types that require a TypedDecoder. We could declare Decodable: TypedDecodable and use a protocol extension to fulfill the init(from: TypedDecoder) method. And then JSONDecoder can conform to TypedDecoder.

The alternative is to just have the decoder containers have something like type(forKey: Key) -> Decodable.Type? that returns nil for non-self-describing decoders. The downside to this is we reintroduce the issue Swift 4.0 has with Dictionary/Array in that we'll have types that may simply throw an error at runtime for something that should have been caught at compile-time, but the upshot is it is a lot simpler.

@lilyball
Copy link
Mannequin

lilyball mannequin commented Feb 14, 2018

BTW where does Decoder guarantee that you can throw away decoding errors and try again without a problem? I'm not finding it in the documentation.

@itaiferber
Copy link
Contributor

Eridius (JIRA User) What does TypedDecoder do that Decoder doesn't? Or is the suggestion that TypedDecoder offers an associated Unevaluated type that can return the arbitrary contents of the payload without touching them?

Decoder itself doesn't guarantee it, but its containers implicitly do. Both KeyedDecodingContainer and SingleValueDecodingContainer are immutable, and UnkeyedDecodingContainer is only allowed to increment currentIndex on successful decodes. Various places in the Codable infrastructure rely on this, but it's impossible to verify statically.

@lilyball
Copy link
Mannequin

lilyball mannequin commented Feb 14, 2018

I was thinking that e.g. TypedKeyedDecodingContainer would have type(forKey:) and so you'd write something like

let value = try container.decode(container.type(forKey: key), forKey: key)

though I apparently hadn't actually thought it through because that is a little awkward and it would probably be simpler just to add a single method that means "return the underlying value without decoding it as anything". If Int, etc conformed to Decodable then I would suggest something like func decode(_ type: Decodable.Protocol, forKey: Key) -> Decodable, except the primitives don't actually conform to Decodable so we can't write this (though we could of course declare conformance on the primitives, and maybe we should since they can trivially implement the protocol). Barring declaring conformance to Decodable on the primitives we could instead return some sort of Unevaluated enum.

@itaiferber
Copy link
Contributor

Yeah, unfortunately, there are two main issues there:

  1. There might not be something reasonable to return from `type(forKey🙂` — again, what type does JSONDecoder return for type(forKey: .value) on {"value": 1}?

  2. The result of that could only be used with a non-generic func decode(_ type: Decodable.Protocol, forKey: Key) -> Decodable), as you mention. However, this still wouldn't solve the problem completely as heterogeneous collections would still be impossible to represent — if you've got a JSON payload of {"values": [1, "hello", null]}, you're looking at a [String : [Any]]. [String : [Any]] can't be Decodable because [Any] can't be Decodable because Any isn't Decodable.

The best solution at the moment, I think, would be to introduce Decoder-specific marker types that are handled on a case-by-case basis by the decoder, e.g.

struct JSONDecoder {
    struct UnevaluatedJSON : Decodable {
        let value: Any
        init(from decoder: Decoder) throws { throw /* can only be decoded by JSONDecoder */ }
    }
}

JSONDecoder's containers can then special-case UnevaluatedJSON to simply return whatever they already contain for that key/location.

@itaiferber
Copy link
Contributor

Also, all the primitive types are Codable:

let values: [Any] = [
    true,
    (1 as Int8),
    (1 as Int16),
    (1 as Int32),
    (1 as Int64),
    (1 as Int),
    (1 as UInt8),
    (1 as UInt16),
    (1 as UInt32),
    (1 as UInt64),
    (1 as UInt),
    (1 as Float),
    (1 as Double),
    "hello"
]

for value in values {
    print("\(type(of: value)) is Codable: \(value is Codable)")
}

produces

Bool is Codable: true
Int8 is Codable: true
Int16 is Codable: true
Int32 is Codable: true
Int64 is Codable: true
Int is Codable: true
UInt8 is Codable: true
UInt16 is Codable: true
UInt32 is Codable: true
UInt64 is Codable: true
UInt is Codable: true
Float is Codable: true
Double is Codable: true
String is Codable: true

@lilyball
Copy link
Mannequin

lilyball mannequin commented Feb 14, 2018

1. JSONDecoder could either return Double or it could inspect the NSNumber to find out what the wrapped primitive number is and return that (in case the underlying NSJSONSerialization actually bothers to determine that it can be represented as an integer and stores it that way).
2. Good point about heterogenous arrays. But it could always return [TypedDecodable] as the array type. We'd need Array and Dictionary to declare explicit conformance to TypedDecodable when the value type itself is the TypedDecodable existential, but we have to do that anyway if we actually want to be able to declare a struct as containing e..g [String: TypedDecodable] and have it decode.

@lilyball
Copy link
Mannequin

lilyball mannequin commented Feb 14, 2018

Are the primitives conforming to Codable new? It doesn't appear to be this way in Swift 4.0. In any case, that makes my proposal more workable.

@itaiferber
Copy link
Contributor

No, it's definitely been this way since 4.0. If you run the code above, it returns false?

@lilyball
Copy link
Mannequin

lilyball mannequin commented Feb 14, 2018

Hmm you're right, it prints true. I can't find the conformance in the documentation though.

@itaiferber
Copy link
Contributor

Hmm — looks like the documentation lists encode(to🙂 and init(from🙂 in an Encoding and Decoding section, but doesn't list Encodable or Decodable in the list of conformances. Seems like that's worth a Radar.

@lilyball
Copy link
Mannequin

lilyball mannequin commented Feb 14, 2018

Filed as rdar://problem/37547329.

@itaiferber
Copy link
Contributor

Excellent, thanks!

However, back to your actual points:

  1. It could return Double, but for numbers which are not representable without loss of precision in Double, it would have to fall back to a different type (e.g. for any integral value > 2^53). It could return Int64 or UInt64 in those cases, but this would be surprising to someone working in the JavaScript mindset where everything is Double and loss of precision is fine (and expected). For smaller numbers, some folks expect to default to Swift's de-facto integer type Int, while others would prefer the smallest type possible which could hold the number (so 1 should be Int8). In the Swift model where numeric values are not implicitly convertible to one another, having the wrong assumptions in your head can be confusing because you'll see as-casts fail in unexpected places. I bring this up not to be contrarian, but to bring up real concerns that have come up during API discussions where real-world scenarios had different bearings on folks expectations. (Thus, BTW, inverting the problem — there's no issue of user expectation when it's the user expressing the type they want rather than what type exists and getting a surprising answer.) Another solution is to pass the buck further along and simply return NSNumber and let someone else unwrap the type-erased box. This works much more nicely on Darwin platforms than on Linux, where you can't as-cast to get a real value out of the NSNumber.

  2. We can add extension Array : TypedDecodable where Element : TypedDecodable and extension Dictionary where Element : TypedDecodable, Value : TypedDecodable, but that still won't solve the problem — [String: TypedDecodable] itself can't conform to TypedDecodable because TypedDecodable doesn't conform to itself. (Same as [String: Decodable] because Decodable doesn't conform to itself.)

@lilyball
Copy link
Mannequin

lilyball mannequin commented Feb 14, 2018

1. I suspect the best solution is probably to inspect the underlying type of the NSNumber, if it's an integral type then return the smallest of Int, Int64, UInt64 that fits it, otherwise return Double.
2. Right, which is why you need Extension Array : TypedDecodable where Element == TypedDecodable and the same for Dictionary. That said, I haven't actually experimented how Swift 4.1 behaves when you declare multiple specialized protocol conformances on the same type.

@itaiferber
Copy link
Contributor

I think many people are surprised in practice by the result of the boxed type inside of an NSNumber (if integral, always Int64 unless too large, in which case UInt64; otherwise Double). 🙂

Also, you can't conform a type to the same protocol multiple times using conditional conformance, which means we could either have where Element : TypedDecodable or where Element == TypedDecodable, but not both.

@lilyball
Copy link
Mannequin

lilyball mannequin commented Feb 14, 2018

Dammit. That limitation pretty much requires us to use some sort of type-erased struct as the array/dictionary value simply so it can conform to TypedDecodable then. I really wish we could declare that protocol existentials conform to the protocol.

@itaiferber
Copy link
Contributor

Regardless, I don't think TypedDecodable is the direction we'd want to take this solution in, for a few reasons:

  1. It splits Codable into two based on the encoding format. This split based on format is something we tried very hard to avoid, even at the cost of accepting that some formats simply wouldn't work well with Codable; it's meant to be a minimal, least-common-denominator-which-would-give-us-90%-coverage way of expressing encoding and decoding. Splitting it up based on whether a format could be self-describing or not defeats that purpose

  2. Adding a second protocol more than doubles the surface area of the API and makes things potentially much more confusing. (Developer questions: When should a type conform with TypedDecodable over Decodable and vice versa? Can TypedDecodable conformance be easily synthesized by the compiler? If TypedDecodable exists, why even bother with Decodable? [or vice versa])

  3. Strangeness/impossibility of inheritance in the APIs. Swift protocols cannot refine method/initializer requirements from other protocols, so if you define

    protocol TypedDecodable {
        init(from decoder: TypedDecoder)
    }
    
    protocol Decodable : TypedDecodable {
        init(from decoder: Decoder)
    }

    a type conforming to Decodable would need to provide both init(from🙂 methods, which defeats the purpose. This means that the two protocols can't be related to one another, and so further bifurcates the API.

There are some other issues at hand, but in general, I think that going the Unevaluated route solves this issue more cleanly. Either:

  1. If you care about what you're trying to decode, provide a concrete Codable type and actually decode it; I've yet to see a case where input is so truly unbounded in type it's impossible to represent, or

  2. If you don't care about what you're trying to decode (you just need to pass it along), decode Unevaluated

@lilyball
Copy link
Mannequin

lilyball mannequin commented Feb 14, 2018

It doesn't double the API. Protocol inheritance means you get to re-use most of it. I think there is a bit of confusion here with regards to TypedDecoder wants to conform to Decoder but on the decodable side it's flipped, but I haven't actually written out all the types involved yet so I'm not sure if there is actually a problem here or not.

In any case, the Unevaluated route basically just means you risk runtime errors if you have a decoder for a non-self-describing protocol, because there's no compile-time marker that says whether or not a type requires the protocol to be self-describing. This is fine if we don't really expect anyone to write decoders for non-self-describing protocols, but it's not very forward-looking, and it seems perfectly reasonable for someone to want to be able to write a decoder for a non-self-describing protocol.

@itaiferber
Copy link
Contributor

Here's the issue:

  • If TypedDecodable : Decodable, you can pass in TypedDecodable anywhere that accepts Decodable. But, TypedDecodable has more stringent requirements than Decodable does, so you risk runtime failure; this is an inversion of inheritance

  • If Decodable : TypedDecodable, you need to be able to implement the init(from: TypedDecoder) requirement in terms of init(from: Decoder) in an extension so clients only need to implement init(from: Decoder). This isn't possible because TypedDecoder is a more specific type than Decoder — you can't implement something requiring TypedDecoder in terms of Decoder or else we wouldn't have this problem in the first place

@lilyball
Copy link
Mannequin

lilyball mannequin commented Feb 14, 2018

Ah yes. So I guess you wouldn't have either decodable inherit from the other, and instead you'd just extend the typed decoder containers to have one extra decode<T: TypedDecodable>(_ type: T.Type, forKey: Key) -> T method, so you can decode both typed and untyped decodables that way (and the existing untyped decoders can only decode untyped decodables).

@itaiferber
Copy link
Contributor

I think that still undesirably splits the API based on the format. Now to support all formats, you need to conform with both Decodable and TypedDecodable, and in the case that you need to customize your type, implement both init(from: Decoder) and init(from: TypedDecoder), likely duplicating most, if not all of the code between them. This would only get worse if we had to duplicate Encodable into TypedEncodable for symmetry (and if we didn't, the lack of symmetry would be confusing too).

I agree that Unevaluated can lead to failure modes at runtime (given a way to query "does this decoder support an unevaluated type?", does your type even know what to do in both cases?), but it vastly simplifies the story for the vast majority of developers.

@lilyball
Copy link
Mannequin

lilyball mannequin commented Feb 15, 2018

No you'd conform to either Decodable or TypedDecodable, not both. The typed decoder containers could decode both Decodable and TypedDecodable types so there's no worries there.

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler The Swift compiler in itself improvement standard library Area: Standard library umbrella
Projects
None yet
Development

No branches or pull requests

3 participants