Avro in Ruby
This is a short example showing how to use Avro, a data serialization format whose schemas are written in JSON.
The schema is stored in the payload along with the data. This means we can turn a Ruby Hash into Avro data and, when it is deserialized back to a Hash, get back the same value types as were in the original Hash. Unlike Ruby's Marshal format, the same data can be read in other languages too.
Writing Avro
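require 'avro'
require 'json' # makes Hash#to_json explicit
schema = {
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "points", "type": "int"},
    # the default of a boolean field must be a boolean, not the string "false"
    {"name": "winner", "type": "boolean", "default": false}
  ]
}.to_json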
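schema = Avro::Schema.parse(schema)
# the datum writer encodes individual records according to the schema
datum_writer = Avro::IO::DatumWriter.new(schema)
buffer = StringIO.new
# the data file writer adds the header (including the schema) around the records
writer = Avro::DataFile::Writer.new(buffer, datum_writer, schema)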
writer << {"name" => "Sally", "points" => 25, "winner" => true}
writer.close # important: flushes the pending records to the buffer
result = buffer.string
result # => "Obj\u0001\u0004\u0014avro.codec\bnull\u0016avro.schema\xC2\u0002{\"type\":\"r..."
Avro is a binary format.
Note that buffer can be any IO object, e.g. a file.
Reading Avro
require 'avro'
buffer = StringIO.new(input)
dr = Avro::DataFile::Reader.new(buffer, Avro::IO::DatumReader.new)
data = dr.to_a # => [{ ... }]
The input is the same as result in the previous code segment.
You will get back a Ruby Array of Hashes with correctly typed values.
If you want to work with Avro from the command line there is avro-tools, which is installable with brew (macOS).
Check out this page for some examples.