FastAvro doesn’t work with Confluent AVRO and here’s how to get it working

Albert Wong
1 min read · Nov 22, 2024


If you use the Python fastavro library to send messages to a Kafka topic and then read them with a Confluent Avro deserializer, the consumer fails with an error like Unknown magic byte:

Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
at io.confluent.kafka.serializers.AbstractKafkaSchemaSerDe.getByteBuffer(AbstractKafkaSchemaSerDe.java:244)
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer$DeserializationContext.<init>(AbstractKafkaAvroDeserializer.java:334)
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:151)
at org.apache.hudi.utilities.deser.KafkaAvroSchemaDeserializer.deserialize(KafkaAvroSchemaDeserializer.java:78)
at io.confluent.kafka.serializers.KafkaAvroDeserializer.deserialize(KafkaAvroDeserializer.java:53)
at org.apache.kafka.common.serialization.Deserializer.deserialize(Deserializer.java:60)
at org.apache.kafka.clients.consumer.internals.Fetcher.parseRecord(Fetcher.java:1426)
... 73 more

It took me a while, but with the help of my coworker Sagar I found this article, https://blog.datachef.co/deserialzing-confluent-avro-record-kafka-spark?x-host=blog.datachef.co#revealing-the-confidential-confluent-avro-format, which explains that a Confluent Avro message on Kafka isn’t “just an Avro file”. It’s special: 5 additional bytes are prepended to the Avro-encoded payload, a magic byte (0) followed by a 4-byte schema ID. If you use libraries like https://github.com/wbarnha/kafka-python-ng, the serializer and deserializer classes already know how to parse this Kafka message. If you use fastavro, you have to construct the header yourself.

If you want a Python Kafka client that connects to Kafka/MSK and the AWS Glue Schema Registry, check out https://github.com/sagarlakshmipathy/python-avro-msk-glue-sr
