Tuesday 4 August 2015

Protocol Buffers

Protocol Buffers are a language-neutral, platform-neutral extensible mechanism for serializing structured data (official site).

In short, Protocol Buffers are Google's in-house binary serialization format, designed to serialize small messages quickly and with a small data size.

Where to use them
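Protocol Buffers are a good fit when you need fast serialization and a small payload, that is, where efficiency and speed matter. They are less convenient when the data format is likely to change in incompatible ways (for example changing the type or number of an existing field, or adding or removing required fields), as this can force you to recreate schemas and drop backwards compatibility, or to maintain multiple schemas for different versions. Adding new optional fields, by contrast, is backwards compatible.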

Pros
  • Fast
  • Small Payload
  • Many cross-platform implementations (C++, Java, C#, Python, JS)
  • Generators are provided to generate data access classes
Cons
  • Schema Enforced (Need to define .proto file)
  • More complicated API than some other serializers

Proto Format
Protocol Buffer messages are first defined in a .proto file using the proto language. This can then be used to generate data access objects for your language of choice.
 // Declare the syntax version explicitly; newer compilers warn without it.
 syntax = "proto2";

 message Boiler
 {
   required string BoilerId = 1;
   required int32 TableNumber = 2;
   required string TableName = 3;
   required string BoilerData = 4;
   required string BrandName = 5;
   required string Qualifier = 6;
   required string ModelName = 7;
 }
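Each field is described by a rule (required, optional or repeated), followed by a type (int32, int64, string, etc.), the field name, then = and a field number that must be unique within the message. The field number, not the name, is what identifies the field in the serialized binary, so it should not be changed once the message is in use. For more information read the v2 spec.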
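To make the role of the field number concrete, here is a minimal sketch in plain Python of how a string field and an int32 field are laid out on the wire. It does not use the official protobuf library; the helper names (encode_varint, encode_key and so on) and the sample values "B1" and 300 are made up for illustration. Only the first two fields of Boiler are encoded, so a real parser would reject this payload because the other required fields are missing.

```python
# Hand-rolled sketch of the protobuf wire format (not the official library).
# All helper names and sample values here are hypothetical.

WIRE_VARINT = 0             # wire type used by int32, int64, bool, enum
WIRE_LENGTH_DELIMITED = 2   # wire type used by string, bytes, nested messages


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def encode_key(field_number: int, wire_type: int) -> bytes:
    """A field key is the field number and wire type packed into one varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_int32_field(field_number: int, value: int) -> bytes:
    return encode_key(field_number, WIRE_VARINT) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    data = text.encode("utf-8")
    return (encode_key(field_number, WIRE_LENGTH_DELIMITED)
            + encode_varint(len(data))
            + data)


# BoilerId = 1 and TableNumber = 2, matching the numbers in the .proto above.
payload = encode_string_field(1, "B1") + encode_int32_field(2, 300)
print(payload.hex())  # 0a02423110ac02
```

Note that the field names never appear in the output, only the numbers 1 and 2 (packed into the key bytes 0a and 10). This is why renaming a field is harmless but renumbering it is not.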

Proto3
At the time of writing the current version is proto2, but proto3 is currently in alpha and under development (source code on GitHub). The update will include a new version of the .proto syntax.

Binary vs Text Serialization
 {  
  "Id": "21",  
  "Name": "Test User",  
  "address": {  
   "streetAddress": "Test Street",  
   "city": "Birmingham"  
  }  
 }  
Some formats serialize to text (JSON, XML, etc.). This can be really useful if you need to quickly view and edit documents without special tooling. Binary serializations require a tool that understands the format before you can read and manipulate the data, but they are normally smaller and more efficient.
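As a rough illustration of the size difference, the sketch below encodes the JSON document above twice: once as compact JSON and once in a hand-rolled protobuf-style binary layout. The field numbers (1 to 3, and 1 to 2 for the nested address) are assumptions chosen for the example, since no .proto file exists for this record, and the helper functions are hypothetical rather than part of any library.

```python
import json

# The same record as the JSON example above.
record = {
    "Id": "21",
    "Name": "Test User",
    "address": {"streetAddress": "Test Street", "city": "Birmingham"},
}


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def bytes_field(field_number: int, data: bytes) -> bytes:
    """Length-delimited field: key (wire type 2), length, then the raw bytes."""
    key = encode_varint((field_number << 3) | 2)
    return key + encode_varint(len(data)) + data


def string_field(field_number: int, text: str) -> bytes:
    return bytes_field(field_number, text.encode("utf-8"))


# Text form: compact JSON with no extra whitespace.
json_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")

# Binary form: assumed field numbers, nested message embedded as bytes.
address = (string_field(1, record["address"]["streetAddress"])
           + string_field(2, record["address"]["city"]))
binary = (string_field(1, record["Id"])
          + string_field(2, record["Name"])
          + bytes_field(3, address))

print("JSON bytes:  ", len(json_bytes))
print("Binary bytes:", len(binary))
```

The binary form is smaller mainly because the field names are replaced by one-byte keys and there are no quotes or braces. The trade-off is the one described above: the JSON can be read in any text editor, while the binary is meaningless without the schema.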

References
https://developers.google.com/protocol-buffers/
https://github.com/google/protobuf
https://developers.google.com/protocol-buffers/docs/reference/proto2-spec
