Installing DataHub Core (docker compose) with personal tokens

Albert Wong
5 min read · Nov 14, 2024


DataHub does not provide a free managed solution, so to try it at no cost you need to install DataHub Core, the open-source version of DataHub. By default, the docker compose quickstart doesn’t enable token-based authentication. Here are the steps to enable it.

Docker doesn’t officially support Amazon Linux. If you want a better-supported setup, use Ubuntu or Debian.

  1. Create an EC2 instance running Amazon Linux 2023, xlarge size, on a public network.
  2. Install Docker and Docker Compose. https://medium.com/@fredmanre/how-to-configure-docker-docker-compose-in-aws-ec2-amazon-linux-2023-ami-ab4d10b2bcdc
   ,     #_
   ~\_  ####_        Amazon Linux 2023
  ~~  \_#####\
  ~~     \###|
  ~~       \#/ ___   https://aws.amazon.com/linux/amazon-linux-2023
   ~~       V~' '->
    ~~~         /
      ~~._.   _/
         _/ _/
       _/m/'
[ec2-user@ip-172-31-21-123 ~]$ sudo yum install docker -y
Last metadata expiration check: 0:00:41 ago on Mon Nov 18 17:05:45 2024.
Dependencies resolved.
=========================================================================================================================================================================================================================================================================
Package Architecture Version Repository Size
=========================================================================================================================================================================================================================================================================
Installing:
docker x86_64 25.0.6-1.amzn2023.0.2 amazonlinux 44 M
Installing dependencies:
containerd x86_64 1.7.23-1.amzn2023.0.1 amazonlinux 36 M
iptables-libs x86_64 1.8.8-3.amzn2023.0.2 amazonlinux 401 k
iptables-nft x86_64 1.8.8-3.amzn2023.0.2 amazonlinux 183 k
libcgroup x86_64 3.0-1.amzn2023.0.1 amazonlinux 75 k
libnetfilter_conntrack x86_64 1.0.8-2.amzn2023.0.2 amazonlinux 58 k
libnfnetlink x86_64 1.0.1-19.amzn2023.0.2 amazonlinux 30 k
libnftnl x86_64 1.2.2-2.amzn2023.0.2 amazonlinux 84 k
pigz x86_64 2.5-1.amzn2023.0.3 amazonlinux 83 k
runc x86_64 1.1.14-1.amzn2023.0.1 amazonlinux 3.2 M

Transaction Summary
=========================================================================================================================================================================================================================================================================
Install 10 Packages

Total download size: 84 M
Installed size: 317 M
Downloading Packages:
(1/10): iptables-libs-1.8.8-3.amzn2023.0.2.x86_64.rpm 2.9 MB/s | 401 kB 00:00
(2/10): iptables-nft-1.8.8-3.amzn2023.0.2.x86_64.rpm 4.3 MB/s | 183 kB 00:00
(3/10): libcgroup-3.0-1.amzn2023.0.1.x86_64.rpm 2.0 MB/s | 75 kB 00:00
(4/10): libnetfilter_conntrack-1.0.8-2.amzn2023.0.2.x86_64.rpm 2.9 MB/s | 58 kB 00:00
(5/10): libnfnetlink-1.0.1-19.amzn2023.0.2.x86_64.rpm 1.5 MB/s | 30 kB 00:00
(6/10): libnftnl-1.2.2-2.amzn2023.0.2.x86_64.rpm 4.0 MB/s | 84 kB 00:00
(7/10): pigz-2.5-1.amzn2023.0.3.x86_64.rpm 2.4 MB/s | 83 kB 00:00
(8/10): runc-1.1.14-1.amzn2023.0.1.x86_64.rpm 20 MB/s | 3.2 MB 00:00
(9/10): docker-25.0.6-1.amzn2023.0.2.x86_64.rpm 48 MB/s | 44 MB 00:00
(10/10): containerd-1.7.23-1.amzn2023.0.1.x86_64.rpm 32 MB/s | 36 MB 00:01
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 71 MB/s | 84 MB 00:01
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Installing : runc-1.1.14-1.amzn2023.0.1.x86_64 1/10
Installing : containerd-1.7.23-1.amzn2023.0.1.x86_64 2/10
Running scriptlet: containerd-1.7.23-1.amzn2023.0.1.x86_64 2/10
Installing : pigz-2.5-1.amzn2023.0.3.x86_64 3/10
Installing : libnftnl-1.2.2-2.amzn2023.0.2.x86_64 4/10
Installing : libnfnetlink-1.0.1-19.amzn2023.0.2.x86_64 5/10
Installing : libnetfilter_conntrack-1.0.8-2.amzn2023.0.2.x86_64 6/10
Installing : iptables-libs-1.8.8-3.amzn2023.0.2.x86_64 7/10
Installing : iptables-nft-1.8.8-3.amzn2023.0.2.x86_64 8/10
Running scriptlet: iptables-nft-1.8.8-3.amzn2023.0.2.x86_64 8/10
Installing : libcgroup-3.0-1.amzn2023.0.1.x86_64 9/10
Running scriptlet: docker-25.0.6-1.amzn2023.0.2.x86_64 10/10
Installing : docker-25.0.6-1.amzn2023.0.2.x86_64 10/10
Running scriptlet: docker-25.0.6-1.amzn2023.0.2.x86_64 10/10
Created symlink /etc/systemd/system/sockets.target.wants/docker.socket → /usr/lib/systemd/system/docker.socket.

Verifying : containerd-1.7.23-1.amzn2023.0.1.x86_64 1/10
Verifying : docker-25.0.6-1.amzn2023.0.2.x86_64 2/10
Verifying : iptables-libs-1.8.8-3.amzn2023.0.2.x86_64 3/10
Verifying : iptables-nft-1.8.8-3.amzn2023.0.2.x86_64 4/10
Verifying : libcgroup-3.0-1.amzn2023.0.1.x86_64 5/10
Verifying : libnetfilter_conntrack-1.0.8-2.amzn2023.0.2.x86_64 6/10
Verifying : libnfnetlink-1.0.1-19.amzn2023.0.2.x86_64 7/10
Verifying : libnftnl-1.2.2-2.amzn2023.0.2.x86_64 8/10
Verifying : pigz-2.5-1.amzn2023.0.3.x86_64 9/10
Verifying : runc-1.1.14-1.amzn2023.0.1.x86_64 10/10

Installed:
containerd-1.7.23-1.amzn2023.0.1.x86_64 docker-25.0.6-1.amzn2023.0.2.x86_64 iptables-libs-1.8.8-3.amzn2023.0.2.x86_64 iptables-nft-1.8.8-3.amzn2023.0.2.x86_64 libcgroup-3.0-1.amzn2023.0.1.x86_64 libnetfilter_conntrack-1.0.8-2.amzn2023.0.2.x86_64
libnfnetlink-1.0.1-19.amzn2023.0.2.x86_64 libnftnl-1.2.2-2.amzn2023.0.2.x86_64 pigz-2.5-1.amzn2023.0.3.x86_64 runc-1.1.14-1.amzn2023.0.1.x86_64

Complete!
[ec2-user@ip-172-31-21-123 ~]$ sudo service docker start
Redirecting to /bin/systemctl start docker.service
[ec2-user@ip-172-31-21-123 ~]$ sudo curl -L https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m) -o /usr/local/bin/docker-compose
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 61.0M 100 61.0M 0 0 53.8M 0 0:00:01 0:00:01 --:--:-- 109M
[ec2-user@ip-172-31-21-123 ~]$ sudo chmod +x /usr/local/bin/docker-compose
[ec2-user@ip-172-31-21-123 ~]$ docker-compose version
Docker Compose version v2.30.3
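
Before moving on, it can help to confirm that both binaries are actually on the PATH. This is a minimal sketch and not part of the original walkthrough; the helper name have is made up for illustration.

```shell
# Hypothetical helper: succeeds if the named command is found on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Report the status of each tool this guide relies on.
for tool in docker docker-compose; do
    if have "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found"
    fi
done
```

If either tool is reported as not found, repeat the corresponding install command above before continuing.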

Additional instructions: enable Docker at boot and add ec2-user to the docker group so you can run docker without sudo.

[ec2-user@ip-172-31-23-6 ~]$ sudo chkconfig docker on
Note: Forwarding request to 'systemctl enable docker.service'.
Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /usr/lib/systemd/system/docker.service.
[ec2-user@ip-172-31-23-6 ~]$ sudo usermod -a -G docker ec2-user

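Restart the instance (or log out and back in) so the group change takes effect, then check that everything works by running the docker ps command. An empty container list means Docker is working; a "permission denied" error means the group change has not been applied to your session yet.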

ubuntu@ip-172-31-30-229:~$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
[ec2-user@ip-172-31-23-6 ~]$ docker ps -a
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.44/containers/json?all=1": dial unix /var/run/docker.sock: connect: permission denied
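
The permission error above usually comes down to the difference between the groups recorded for the account and the groups of the current login session. Here is a small sketch that tells the two cases apart, assuming the standard id utility; the helper name has_group is made up for illustration.

```shell
# Hypothetical helper: succeeds if the space-separated group list $1
# contains exactly the group name $2.
has_group() {
    echo "$1" | tr ' ' '\n' | grep -qxF -- "$2"
}

session_groups="$(id -nG)"               # groups of the current login session
account_groups="$(id -nG "$(id -un)")"   # groups recorded for the account

if has_group "$session_groups" docker; then
    echo "This session has the docker group: docker ps should work without sudo."
elif has_group "$account_groups" docker; then
    echo "The account is in the docker group, but this session predates the change: log out and back in."
else
    echo "The account is not in the docker group: run the usermod command above."
fi
```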

3. Install python3, python3-pip, and python3-virtualenv.

[ec2-user@ip-172-31-21-123 ~]$ sudo yum install python
Last metadata expiration check: 0:11:08 ago on Mon Nov 18 17:05:45 2024.
Dependencies resolved.
=========================================================================================================================================================================================================================================================================
Package Architecture Version Repository Size
=========================================================================================================================================================================================================================================================================
Installing:
python-unversioned-command noarch 3.9.16-1.amzn2023.0.9 amazonlinux 10 k

Transaction Summary
=========================================================================================================================================================================================================================================================================
Install 1 Package

Total download size: 10 k
Installed size: 23
Is this ok [y/N]: y
Downloading Packages:
python-unversioned-command-3.9.16-1.amzn2023.0.9.noarch.rpm 176 kB/s | 10 kB 00:00
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 75 kB/s | 10 kB 00:00
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Installing : python-unversioned-command-3.9.16-1.amzn2023.0.9.noarch 1/1
Running scriptlet: python-unversioned-command-3.9.16-1.amzn2023.0.9.noarch 1/1
Verifying : python-unversioned-command-3.9.16-1.amzn2023.0.9.noarch 1/1

Installed:
python-unversioned-command-3.9.16-1.amzn2023.0.9.noarch

Complete!
[ec2-user@ip-172-31-21-123 ~]$ sudo yum install python3-pip
Last metadata expiration check: 0:11:28 ago on Mon Nov 18 17:05:45 2024.
Dependencies resolved.
=========================================================================================================================================================================================================================================================================
Package Architecture Version Repository Size
=========================================================================================================================================================================================================================================================================
Installing:
python3-pip noarch 21.3.1-2.amzn2023.0.9 amazonlinux 1.8 M
Installing weak dependencies:
libxcrypt-compat x86_64 4.4.33-7.amzn2023 amazonlinux 92 k

Transaction Summary
=========================================================================================================================================================================================================================================================================
Install 2 Packages

Total download size: 1.9 M
Installed size: 11 M
Is this ok [y/N]: y
Downloading Packages:
(1/2): libxcrypt-compat-4.4.33-7.amzn2023.x86_64.rpm 1.1 MB/s | 92 kB 00:00
(2/2): python3-pip-21.3.1-2.amzn2023.0.9.noarch.rpm 19 MB/s | 1.8 MB 00:00
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 12 MB/s | 1.9 MB 00:00
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Installing : libxcrypt-compat-4.4.33-7.amzn2023.x86_64 1/2
Installing : python3-pip-21.3.1-2.amzn2023.0.9.noarch 2/2
Running scriptlet: python3-pip-21.3.1-2.amzn2023.0.9.noarch 2/2
Verifying : libxcrypt-compat-4.4.33-7.amzn2023.x86_64 1/2
Verifying : python3-pip-21.3.1-2.amzn2023.0.9.noarch 2/2

Installed:
libxcrypt-compat-4.4.33-7.amzn2023.x86_64 python3-pip-21.3.1-2.amzn2023.0.9.noarch

Complete!

Start the DataHub install by creating and activating a virtual environment (venv).

python3 -m venv datahub
source datahub/bin/activate
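
To double-check that the venv is really active in your current shell, you can look at the VIRTUAL_ENV variable, which the activate script sets. A minimal sketch; the helper name venv_status is made up for illustration.

```shell
# Hypothetical helper: prints whether a virtual environment is active,
# based on the VIRTUAL_ENV variable that the activate script sets.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: $VIRTUAL_ENV"
    else
        echo "inactive"
    fi
}

venv_status
```

If this prints inactive, run the source command again in the same shell before installing anything with pip.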

4. Install DataHub Core using the Quickstart Guide at https://datahubproject.io/docs/quickstart
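
For reference, the CLI part of the Quickstart Guide boils down to a few pip commands, as far as I know at the time of writing. The sketch below wraps them in a function (the name install_datahub_cli is made up) so nothing runs until you call it from inside the activated venv. Check the guide itself for the current commands, since they may change.

```shell
# Hypothetical wrapper around the pip commands from the Quickstart Guide.
# Defining the function does not install anything; call it explicitly.
install_datahub_cli() {
    # Upgrade the packaging tools first, then install the DataHub CLI.
    python3 -m pip install --upgrade pip wheel setuptools &&
    python3 -m pip install --upgrade acryl-datahub &&
    # Print the installed CLI version as a sanity check.
    datahub version
}
```

Run install_datahub_cli once the venv is active; the datahub command it installs is the one used in the remaining steps.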

5. Enable token-based authentication and generate a token. More information can be found at https://datahubproject.io/docs/authentication/.

# Run this command before starting the service
export METADATA_SERVICE_AUTH_ENABLED=true
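
The variable only helps if it is exported in the same shell that later starts the service. A quick way to check is to ask a child process whether it can see the value. This is a sketch; the helper name auth_flag_exported is made up for illustration.

```shell
# Hypothetical helper: succeeds only if METADATA_SERVICE_AUTH_ENABLED is
# visible to child processes (i.e. exported) and set to "true".
auth_flag_exported() {
    [ "$(sh -c 'printf %s "${METADATA_SERVICE_AUTH_ENABLED:-}"')" = "true" ]
}

if auth_flag_exported; then
    echo "Token-based authentication flag is exported."
else
    echo "Flag is missing or not exported: run the export command above in this shell."
fi
```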

6. Start the service

# Start DataHub
datahub docker quickstart
# Stop DataHub when you are done
datahub docker quickstart --stop

If downloading the yml fails with an error, download it manually with wget and point the quickstart at the local file:

# Start DataHub from the local compose file
datahub docker quickstart --quickstart-compose-file docker-compose-without-neo4j.quickstart.yml
# Stop DataHub when you are done
datahub docker quickstart --quickstart-compose-file docker-compose-without-neo4j.quickstart.yml --stop
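
The manual download could look like the sketch below. The URL is my assumption about where the file lives in the DataHub GitHub repository, so verify it before relying on it; the helper name download_compose_file is made up for illustration.

```shell
# Assumed location of the compose file in the DataHub GitHub repository;
# verify it first, since the repository layout may change.
COMPOSE_FILE="docker-compose-without-neo4j.quickstart.yml"
COMPOSE_URL="https://raw.githubusercontent.com/datahub-project/datahub/master/docker/quickstart/$COMPOSE_FILE"

# Hypothetical helper: download the compose file into the current directory.
# Defining the function does not download anything; call it explicitly.
download_compose_file() {
    wget -O "$COMPOSE_FILE" "$COMPOSE_URL"
}

echo "Compose file URL: $COMPOSE_URL"
```

Call download_compose_file from the directory where you run the datahub commands, so the relative file name passed to --quickstart-compose-file resolves.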

7. Allow inbound traffic on port 9002 (UI) and port 8080 (REST API) in the instance's AWS Security Group.
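
Once port 8080 is reachable and you have generated a token in the UI, you can try the token against the REST API. This is a sketch under assumptions: DATAHUB_HOST and DATAHUB_TOKEN are variable names I made up, the helper names are illustrative, and the entity URN in the usage note is only a placeholder.

```shell
# Hypothetical variable names; replace the defaults with your instance's
# public address and the token generated in the DataHub UI.
DATAHUB_HOST="${DATAHUB_HOST:-localhost}"
DATAHUB_TOKEN="${DATAHUB_TOKEN:-<access-token>}"

# Build the Authorization header for a personal access token.
auth_header() {
    printf 'Authorization: Bearer %s' "$1"
}

# Hypothetical helper: GET a path on the REST API (port 8080) with the
# token attached. Defining it sends nothing; call it explicitly.
datahub_get() {
    curl -sS -H "$(auth_header "$DATAHUB_TOKEN")" "http://$DATAHUB_HOST:8080$1"
}
```

For example, datahub_get /entities/urn:li:chart:customers sends an authenticated request for a placeholder entity. With authentication enabled, the same request without a valid token should be rejected.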

For a more production-like environment, I suggest you look at the AWS or other guides at https://datahubproject.io/docs/deploy/aws/.
