Show HN: Xfer, a data-transfer language

17 points, posted 4 days ago
by paulmooreparks

12 Comments

jottinger

2 hours ago

I think it looks okay, but noisy, and my main concern is that it adds so little to JSON - in particular, RJSON. The built-in token replacement MIGHT be nice, but... is token replacement part of the library itself? Or is that something applied elsewhere?

James_K

2 days ago

There is no need to have a digraph surrounding every element of data. Beyond being ugly, the digraphs make the file almost impossible to edit for anyone not already familiar with them. They require a large number of random symbols to be memorised.

The strict typing is also not really a benefit. You can define the desired types of elements in a schema before reading data (and will need to do that anyway for validation purposes). Having the types also transcribed in the file means that writing out the file and changing the types of data would be much harder. As an example, when you force a file to specify whether integers are 32 bits or 64, you make it less general (it will only work with parsers that match that number of bits) and require extra data to do that. A double loss over just writing the integer out.

Even if you are dedicated to strict typing, it makes more sense to do it with something along the lines of C literals (12f, 100l, etc.) because people are already used to that. This has the added benefit that the letter you put after the number is also the first letter of the type you want it to be, which is easier to remember than ^ for double and * for decimal. There's also no need to put the identifier on both the opening and closing angle brackets. The most obvious notation would be something akin to what you might see in an assembly file, e.g. I32.900 for 900 as a 32-bit integer.
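A minimal Python sketch of the C-literal suffix idea. The suffix table and the `parse_literal` name are made up for illustration; only `f` and `l` come from the examples above, and nothing here reflects how Xfer itself parses numbers.

```python
import re

# Hypothetical suffix table, loosely modelled on C literals.
SUFFIXES = {
    "f": float,  # 12f  -> float
    "l": int,    # 100l -> (long) integer
}

# Digits, an optional fractional part, then an optional one-letter suffix.
LITERAL = re.compile(r"^(-?\d+(?:\.\d+)?)([a-z]?)$")


def parse_literal(text):
    """Parse a numeric literal with an optional type suffix."""
    match = LITERAL.match(text)
    if match is None:
        raise ValueError(f"not a numeric literal: {text!r}")
    digits, suffix = match.groups()
    if suffix == "":
        # No suffix: fall back to the most natural type for the digits.
        return float(digits) if "." in digits else int(digits)
    if suffix not in SUFFIXES:
        raise ValueError(f"unknown type suffix: {suffix!r}")
    # int("1.5") raises ValueError, so "1.5l" is rejected here.
    return SUFFIXES[suffix](digits)


print(parse_literal("12f"))   # 12.0
print(parse_literal("100l"))  # 100
print(parse_literal("78.5"))  # 78.5
```

The type marker is a single trailing letter, so an unsuffixed number stays a valid plain literal and a schema can still decide the final width.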

JSON clearly wins this comparison because people can look at the JSON and guess what it means. People aren't going to go for a more complicated format that makes less sense. I personally prefer Clojure's EDN when it comes to data formats just because it looks nicer and removes redundant syntax. I don't think there's much to be improved upon from there.

knowitnone

2 days ago

Agree, I'd rather see JSON extended to something dumb like this:

  {
    "name": "Alice" :\s,
    "age": 30 :\d,
    "isMember": true :\b,
    "scores": [85, 90, 78.5] :\a,
    "profile": {
      "email": "alice@example.com" :\s,
      "joinedDate": "2023-01-15T12:00:00" :\t
    }
  }

otabdeveloper4

2 days ago

> "scores": [85, 90, 78.5] :\a

You mean `\ad` , obviously. (You see where I'm going with this?)

paulmooreparks

2 days ago

Thanks... yeah, I did play with that idea a little bit, and I may loop back around to that.

MisterKent

2 days ago

Personally, I don't see a huge upside over JSON for interop. Typing is great, but the lack of it is largely mitigated by things like TypeScript, GraphQL generators, etc. Edit: and JSON Schema.

The biggest issue I see with JSON today is the inability to self-reference values. Data duplication overhead is high, and deserializing doesn't work as expected if you want two objects to be reference-equal.
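A small Python sketch of that problem and one possible workaround. The `{"$ref": "key"}` placeholder convention and the `resolve_refs` / `load_with_refs` helpers are assumptions made up for this example; plain JSON defines no such feature, and the sketch only handles references to top-level keys.

```python
import json

# Plain JSON: the shared address has to be written out twice.
duplicated = json.loads(
    '{"billing": {"city": "Oslo"}, "shipping": {"city": "Oslo"}}'
)
# The two values compare equal, but they are separate objects.
assert duplicated["billing"] == duplicated["shipping"]
assert duplicated["billing"] is not duplicated["shipping"]


def resolve_refs(node, root):
    """Replace {"$ref": key} placeholders with the object at root[key]."""
    if isinstance(node, dict):
        if set(node) == {"$ref"}:
            return root[node["$ref"]]
        for key, value in list(node.items()):
            node[key] = resolve_refs(value, root)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            node[index] = resolve_refs(value, root)
    return node


def load_with_refs(text):
    """Parse JSON text, then resolve the hypothetical $ref placeholders."""
    doc = json.loads(text)
    return resolve_refs(doc, doc)


shared = load_with_refs(
    '{"billing": {"city": "Oslo"}, "shipping": {"$ref": "billing"}}'
)
# Now both keys point at the very same dict.
assert shared["shipping"] is shared["billing"]
```

With the placeholder resolved, a change made through one key is visible through the other, which is the reference-equal behaviour that plain deserialization does not give you.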

Also, nitpick: that syntax would be a grind to write manually vs JSON.

paulmooreparks

2 days ago

Thanks so much for your input. Yeah, it's a bit more work to type, but I've already gotten a bit faster (even though I change the syntax so much). Copilot has been helpful, too, once it learned what I was trying to do. A proper syntax helper for VS Code and Visual Studio would probably make things a lot easier, too.

I haven't thought about self-reference. I'll pin that to the list of things to play with. For me, it was comments, placeholders, and typing, and those drove the syntax evolution, especially with regard to nesting of elements. Trust me, you would have really hated the first couple of iterations.

Terretta

2 days ago

I love that you are noodling in the open, and getting feedback / impressions early -- that's the recipe for doing something new, better!

Now, unfiltered first thoughts...

Have you used SOAP RPC and is it clear why the industry (outside of banking and financial services that depend on correctness!) largely shifted to JSONic REST?

At first exposure to this, the delimiters feel like line noise, as if SOAP and a lineprinter got together and said, "How could this offer less utility while being more opaque?"

There's a cognitive tax on something being unlike other things, which is why so many substitutions end up adopting handlebars/mustache type syntax, or so many things look like INI/TOML/YAML/etc.

Have you looked at JSON5, XML + XSLT, JSONNET and the like? What about JSON Schema and similar? ( https://json-schema.org/understanding-json-schema/reference/... )

Not saying any of those are be-all end-all, they are not. But it would be super helpful to have a "Why not X" page on your README that shows you've considered SOTA across the landscape, and why that doesn't work for you. See, for example: https://hitchdev.com/hitchstory/why-not/

paulmooreparks

2 days ago

Thanks so much! It's really beautiful to be able to share ideas and code so openly and get immediate feedback like this.

I have used SOAP on some large projects, and I wasn't a fan. It just feels like, with JSON, the pendulum swung pretty far the other way. Thanks for those references to other standards; I'll look those over and see what I can learn.

I think where JSON really chafes me is in places like configuration files, where I'd like a little finer control and more features, and Xfer has probably shifted more in that direction. It's certainly nowhere near ready for production, and I think it will probably remain an experiment, but it's been fun and educational.

paulmooreparks

2 days ago

Author here. Thank you SO MUCH for the comments. This is more feedback than I could have hoped for. There's no danger of this language replacing JSON or XML any time soon, but it's been a really useful exercise for learning about the issues involved in designing such a language, and this feedback helps me learn even more. I'll keep iterating with these comments in mind.

ghjfrdghibt

2 days ago

I'm reminded of XML where all the information is stored in attributes.

paulmooreparks

2 days ago

This certainly has some similarities to XML. What I never liked about XML was that the smallest possible pair of element tags consumes a minimum of seven characters, whereas Xfer only taxes you with four. Surprisingly, minified Xfer takes only 10 to 15% more space than equivalent JSON. That's still a LOT, but I was hoping the additional length would be a worthy trade-off for features. Maybe there's a more efficient way to get the same features (as someone else commented here).