FastAvro doesn’t work with Confluent AVRO and here’s how to get it working
You’ll get an error like “Unknown magic byte” when using the Python fastavro library to send messages to a Kafka topic.
Nov 22, 2024

Connecting AWS Athena to Preset.io
You would think that https://docs.preset.io/docs/aws-athena is enough to get it working, but it’s not.
Nov 14, 2024

Installing DataHub Core (docker compose) with personal tokens
DataHub does not provide a free managed solution, so you need to install DataHub Core. DataHub Core is the open source version of DataHub…
Nov 14, 2024

Python Client for AWS MSK and AWS Glue Schema Registry and AVRO message payload
While Python is a powerful language for working with AWS MSK, there’s currently a gap in official documentation and examples for…
Nov 13, 2024

Python Client with AWS MSK and Confluent Schema Registry and AVRO message payload
Download the official Python Client at https://github.com/confluentinc/confluent-kafka-python/blob/master/examples/avro_producer.py.
Nov 13, 2024

Creating and Connecting an AWS MSK cluster using SASL
Create an AWS Secret in AWS Secrets Manager (“Other type of secret”, with two key/value rows for username and password; need to create a new…
Nov 6, 2024

Installing Conduktor to connect to MSK on AWS
TL;DR: There are instructions to do this manually, but it’s much, much easier to use the CloudFormation template.
Nov 5, 2024

Debunking the Myth: Club Soccer Referees Are Independent Contractors
Are Club Soccer Referees Biased?
Oct 3, 2024

Universal data lakehouse: The most vendor/tool neutral data architecture.
What is a Universal Data Lakehouse?
Sep 30, 2024

Is vendor and data consolidation enough to save money?
Consolidation is a popular buzzword in the business world, and for good reason. On the surface, it seems logical. By reducing the number of…
Sep 30, 2024

Origin story for Apache Iceberg, Apache Hudi and Delta Lake with vendor control
#Iceberg came from Netflix to support BI and dashboards. Designed for read-heavy workloads (90% reads and 10% writes). Per…
Sep 25, 2024

A tool to search the California DMV website for available custom license plate names
While you can manually search for available license plate names on the DMV website at…
Sep 11, 2024

Data Lakehouse Readiness Score
Easily assess your database’s data lakehouse compatibility!
Sep 11, 2024

Using the latest as of Sept 2024 (Spark 3 drivers) for DBeaver
DBeaver ships with a very old version of Spark and Hadoop (Spark 2 and Hadoop 1.2) for the Hadoop / Spark driver. Since it’s so old, I…
Sep 11, 2024

Apache Hudi with AWS EMR 7.2.0, AWS S3
As of Aug 13, 2024, the AWS Apache Hudi pages have not been updated with instructions on working with AWS EMR 7.2.0.
Aug 13, 2024

How do guest players for Club Soccer in SoCal (Southern California) for tournaments as of Aug 2024
I’m the parent team manager for one of the Rangers FC South teams. Rangers FC South is a member of CalSouth and our players are members of…
Aug 12, 2024

Best Practices for Data Lakehouse Ingestion
A data lakehouse combines the best of both worlds: the scalability and cost-efficiency of a data lake with the reliability and governance…
Aug 12, 2024

Shameful: Two of Silicon Valley’s famous venture capitalists make the case for backing Trump: that…
https://www.theverge.com/2024/7/24/24204706/marc-andreessen-ben-horowitz-a16z-trump-donations
Aug 1, 2024

Configuring Debezium and Apache Kafka with avro and Confluent Schema Registry (w/ auth enabled)
Full example at https://github.com/alberttwong/onehouse-demos/tree/main/mysql-debezium
Aug 1, 2024