All posts in "strimzi"

Back to Engineering Vol. 2: A Path to Wisdom

By Aykut Bulgu / August 21, 2023

A Path to “Wisdom”

I have been a Java developer for about 17 years, starting with Java 1.4, Struts 1.2, and JSP-related stuff. I had the chance to use many other frameworks such as Spring, JSF, and Spring Boot, and as a web developer I enjoyed using Java for web programming. That was the part before I joined Red Hat.

When I was a consultant at Red Hat, I mainly used the Java stack, as Red Hat Middleware was mostly written in Java and the Red Hat application runtimes consist mainly of Java-related technologies such as Spring Boot, Vert.x, and Quarkus.

My Java journey didn’t end when I moved to the Red Hat training team, where I again created applications mostly with Java. I was on the AppDev team, and as I said previously, Java was one of the main languages used for application development on Red Hat technologies.

About 9 months ago, I decided to get serious about Java development under the “engineering” title again by moving to the Red Hat AMQ Streams/Kafka team. I truly enjoy Kafka, as it has been my technical niche for years. However, as the Rolling Stones say in their song (and as is restated a couple of times in the House M.D. series): “you can’t always get what you want”.

In your engineering career, there are times when you have to decide what to do next: your path.

It can be a new company or a new team. It can be a new city or a new country. Moreover, it can be a new language or a new framework. Maybe you are making a total technology stack change by choosing a new path. Sometimes it is simply time to learn new things, grow your career, and improve yourself.

The Wisdom Move

A month or so ago, an internal opportunity came up, and I decided to move to another team within Red Hat, again as an engineer.

I am moving to the Ansible Lightspeed (formerly Wisdom) team as of August 21!

Ansible has been one of the technologies I’ve been interested in for some time, especially when I was a Middleware Consultant at Red Hat. I remember getting a lot of help from Ansible, automating things like installing and setting up a Kafka cluster for a few customers. Now it is time to give something back to Ansible, I guess:) Therefore, I am happy to be a part of the Ansible Lightspeed/Wisdom team.

For this move, I am switching my development stack from Java to Python (and Django).

Python has been one of the languages I’ve been interested in for some time. I developed the Strimzi Kafka CLI with Python and I have no regrets, as Python is one of the best languages for developing a CLI. Apart from my CLI development experience, I have some Django experience: I developed a few Django services for the World Health Organization’s “WHO Academy” project while working with the Red Hat Open Innovation Labs team. When you have a main stack and just touch other languages, it is easy; but since this is a stack change, as Master Yoda says, I might need to unlearn some things to learn new ones as I add something new onto my 15+ years of experience.

Answers for the Mere Mortals

This move will surely raise many questions in the brilliant minds of the folks who already know me. Let me answer some of them upfront.


Q: Why Python? Java is a more enterprise language(!) than Python, so why choose Python?

Wrong. Python is easy to start with for anyone, but it is also widely used in many enterprise areas. Python is widespread in machine learning, and the Ansible project itself is written in Python.

Also, being a fan of a language or technology is wrong. You can love them, but you mustn’t be a fan of them. Lately, I like Python a bit more than Java for a few reasons, but I am not a fan of either.

As a Software Crafter, a software programming language doesn’t matter to me as it is just a tool. Tools can be switched, changed, or updated over time, depending on the work and circumstances.

Q: Don’t you think this is career suicide?

No. To recruiters or similar professions, maybe; but to me it is not, because having more than one tool in your pocket is always an advantage. Learning different things gives you different perspectives, so these kinds of moves are far from career suicide.

Q: What about Java/Jakarta EE/Quarkus? Will you still be contributing to the communities by creating content, giving talks, and doing workshops?

Absolutely! I will continue to write about Java- and Quarkus-related topics, create videos and workshops, and speak at Java/Jakarta EE conferences.

Q: What about Apache Kafka?

I like Kafka, and it’s the niche technology I’ve been specializing in. I cannot and will not stop creating content for it. Even though I am moving away from the Strimzi team, I will continue to work on my Strimzi CLI project, write about it, and record videos for it. There are many articles and related videos in my queue to be created and released. So stay tuned!

Q: Any plans on writing about Ansible/Ansible Lightspeed here?

Ansible will be my second niche for sure, so why not! I like the technology and the control it gives over any system; considering this site’s name is “System Craftsman”, it would be a pity not to include Ansible-related content here. Think of it as the theory from the “System Craftsmanship: Software Craftsmanship in the Cloud Native Era” article finally coming into practice.

Do you have other questions? Feel free to ask in the comment section below, or just wish me luck with my new journey at lightspeed:)

May the Force be with you!


Back to Engineering: An engineer’s gainings through time

By Aykut Bulgu / November 29, 2022

EDIT: As of August 21, I am joining the Ansible Lightspeed engineering team. Stay tuned for another possible article about it!

EDIT2: Here is the new article: Back to Engineering Vol. 2: A Path to Wisdom

TL;DR: As of December 1st, I am starting a new role at Red Hat, this time as a Principal Software Engineer on the Kafka/Strimzi team. If you are curious about my Red Hat journey so far, read the rest:)

It’s been almost 5 years at Red Hat. I started my journey as a middleware consultant. After about 2.5 years, I moved to the Red Hat Training team, where I have been working as a Senior Content Architect until now. That has recently changed: I am moving back to software engineering!

Before Red Hat, I was a Senior Software Engineer at Sahibinden.com, one of the highest-traffic websites in Turkey. Apart from a detour into Business Intelligence and some coding in .NET and PHP, most of my career consists of writing Java. So I called myself a “Java developer” most of the time and developed web application backends with several frameworks such as Struts 1.2 (yeah, I am that old), the JBoss Seam Framework, and the Spring Framework (not Spring Boot!).

At some point, at the company I was working for, I felt that I was doing similar things in similar loops, a kind of vicious cycle that I had to break; otherwise I wouldn’t be able to improve and get better in my career. So I decided to learn some “new stuff” in my spare time in a scheduled way. Cloud, Kubernetes, and cloud-native were popular terms back then, and I started to teach myself in very small chunks (The Kaizen Way) and tried not to break the chain.

Anyway, as a developer who was not very aware of the new cloud-native world, I learned about AWS, Kubernetes, OpenShift, and Fabric8, a Java library for Kubernetes.

It was early 2018. I remember reaching out to my Red Hat software engineer friend, Ali Ok, about an open position at Red Hat, asking if he could refer me (Red Hat has a great job referral mechanism called the “Ambassador Program”, and it values referrals a lot). The position was for a “Middleware/AppDev Consultant”, responsible for customer deliveries around Red Hat middleware, cloud-native apps, and most importantly: OpenShift CI/CD!

I don’t want to expand this part of the story too much, but I went through a couple of interviews (including Java coding interviews via HackerRank at that time), finally got the job, and started at Red Hat as a Middleware/AppDev Consultant.

The Consultancy

I spent about 2.5 years as a Middleware/AppDev Consultant, worked with about 20 different customers, and traveled to many countries, including Senegal (West Africa). I learned and taught a lot of Red Hat technologies such as Apache Camel (Fuse), Infinispan (Data Grid), Jenkins for OpenShift CI, Argo CD for OpenShift CD, Quarkus (pretty new at the time), and most importantly a technology that I am in love with: Apache Kafka. My Kafka journey did not start at Red Hat (I knew a bit about Kafka from my previous job), but it became my “niche” technology there. (For the curious: this presentation includes my short story with Kafka.)

Working with customers can be hard because their requests vary, and communication and ways of working differ depending on people’s personalities (like everywhere) and the company culture.

As a software engineer whose job had been to just write code and mostly only talk in meetings, it was really hard for me initially, because there is “randomness” in that area, which makes the processes “unpredictable”.

One unpredictable situation I remember: I once contacted a customer to arrange a meeting for the start of a delivery (delivery here means the consultancy work for the customer). There were only 3 or 4 people in the meeting request in my calendar, so I was very relieved; I was not a good public speaker back then, and 3-4 people always meant a small meeting in a small room. An easy start! Relieved, I prepared only my hardware and went to bed for the next day.

I went to the customer’s office the next morning, met the person who welcomed me to their company, and we headed to the meeting room. When we came to the door, I suspected from the outside that it was a big room. Well, sometimes we had small meetings in big meeting rooms, so that was something I was used to, and this had to be a similar case.

He opened the door, and there was a big, long desk full of people around it (I could not count, but it must have been more than 40), waiting for me and my unprepared presentation as an intro to development processes with OpenShift.

Well, of course, I did not say “sorry, but I didn’t prepare a presentation” or “I can’t show you anything today”; I sat down and requested about 5 minutes to get “prepared”. What do you think I did next?

Well, luckily, my dear colleague Cenk Kulacoglu, who is now the manager of the Red Hat Consultancy team in Turkey (he was a senior architect back then), had taught me to be prepared at any time. I had seen that he always kept a couple of presentations in his Google Drive so that he could either present them as-is or adapt them to the customer’s needs. That is exactly what I did for that customer.

I told this story because being a consultant not only improves you technically but also improves your soft skills, so that you can give more to your clients.

In consultancy, you not only learn how to be “always prepared” to overcome “unpredictable situations”, but also many other skills that I think improved me in many ways:

  • Public speaking: I was shy about speaking in public, probably because of a couple of childhood traumas. I had a chance to overcome this through the “unpredictable situations” many times. When you expect to see 3 people but learn that you have to present to 40, it improves you. Now I am pretty relaxed speaking in front of a crowd.
  • Presentation skills: I learned a lot from my craftsmanship mentor Lemi Orhan Ergin in the past, and I think I had the chance to evolve that into my own style just by preparing slides. The key part is the “flow”, and you learn it by doing. And if there are times when you have just 30 minutes to prepare a deck, you learn it even better:) (That is another story that I will not detail here.)
  • Presenting skills: A presentation is not only about slides; you have to learn how to explain the topic to the audience. Especially if they are your customers, you have to do it well, because they want to learn from you and they are consulting you. You must distill the knowledge and present it with your tools (the slides) and your public speaking abilities.
  • Persuasion/negotiation skills: Sometimes you have to convince the people or customers you work with. When you are a consultant, you are the one being consulted, so you might need to use your persuasion or negotiation skills from time to time.
  • Fast learning: You must learn things quickly in the fast-moving environment of consultancy, because you will always be expected to know things, especially if they belong to a wide area such as middleware or CI/CD.
  • Flexible attitude (politics skills): This is not about politicians or what they do; it is about managing yourself and your clients. You must always be nice to customers and should not be too direct. Sometimes there will be misunderstandings, and sometimes there will be really tough customers; you have to manage them all, so you should not have one and only one solution for the same thing, but many. This is not a “the customer is always right” thing; it is about using your persuasion, negotiation, and “niceness” skills and being as flexible as you can. This is one of the things I learned rather late.
  • And many more that I might not remember… You can add your own in the comments section below.
WHO Built This

Just before finalizing my days as a consultant, I had a chance to work with a very special team, the Red Hat Open Innovation Labs team, on a very special project, the World Health Organization (WHO) Learning Experience Platform (LXP), at a very special (and weird) time of the year: just after the pandemic hit!

It was a great experience with the WHO team, trying to implement and stabilize a real DevOps culture within their organization and creating a functioning application for their LXP project in just 8 weeks.

Enough words on this; for the curious mere mortals, here is how we did it:

Gaining these kinds of experiences as a “Software Engineer” or consultant makes you think differently. Being physically in a customer environment is one thing; having a remote customer engagement that is worldwide, timezone-dependent, and important (because it is directly about health in a pandemic) is another.

After my ~2.5-year consultancy journey, having had a small taste of working in a remote, worldwide environment, and with a bit of love for writing and creating content to teach people, I decided to move to another team where I could access those benefits and improve at creating training content. I moved to the Red Hat Training team (internally aka Global Learning Services, or GLS).

The “Content” Development

In February 2021 I joined the Red Hat training team with the enthusiasm and excitement of having a chance to create professional content. I would write books for training, and I was very excited about that.

Before becoming a consultant, I had been looking admiringly at the techie people who created content consistently. Some consistently wrote blog posts; others created YouTube content.

The writing part especially interested me, because I think if I had not become a “techie”, I would have been a writer; a novelist. I remember leaving similar notes to every close friend of mine in the high school diary book, like “…in the future it is likely you will find me in a book…”, giving the message to both their subconscious and mine. It is kind of childish, but I have to admit that I liked writing; I was writing short stories and poems at that time (poems in both English and Turkish; Edgar Allan Poe was, and still is, my favorite).

Years later, in university, I lost my interest in writing and things went more toward maths and 0s and 1s. But while trying to learn things on the web, I realized that writing (actually, creating content) is still a thing, and one that will not rot away. I first tried to create some forums: I wrote one myself in PHP, then used vBulletin and similar tools, but I would end up creating the site, writing a couple of posts, and then forgetting about the website.

Later on, I understood that it is not a matter of quality, but of quantity and consistency. That is another topic, though, so I am leaving it for another article.

While I was working on the Strimzi Kafka CLI, writing blog posts about it, and creating video content, I always thought about writing better. If you are a software engineer with the mindset that seeks a spec for everything technical, you also seek a spec for everything in the real world, like written content. I knew that my writing was not good (and it is still not perfect), but I sensed that there should be a better way to do it, with a spec or something.

There are books about creating better content, such as Everybody Writes by Ann Handley and On Writing: A Memoir of the Craft by Stephen King. I bought and read both before I joined the training team, but it was there that I really learned about writing: writing, reviewing, and getting your content reviewed by your colleagues and the editors.

I learned that there really is a spec, like developers have for their programming languages or frameworks. We used the Red Hat Style Guide and the IBM Style Guide, which guided us a lot while we wrote. I think this is how you really learn to do something. While I might still be chased by the zombies of passive voice, I think my writing improved a bit; good enough to help a content creation agency create high-quality content for their customers. I am going to expand on this later in a different post about “Writing for Developers”.

Training the Experts

Working on a training team not only gives you a chance to improve your writing skills; you also become a trainer by thinking from the trainee’s perspective.

We had a lot of discussions about how to present the content to students and how to write better quizzes, labs, and guided exercises; we even tried out different models of instruction so that we could get feedback and improve on our end.

Training content is not like blog post content: you must think about its architecture (because it is actually a book), use inclusive language, keep a formal style, be clear with your instructions, and so on. Most importantly, because it is teamwork, you have to know how to work together and how to connect your dots with other people’s, because in the end it is a single body of content that customers and Red Hat trainers use.

In the period I was involved, I had a chance to work on many training courses, including OpenShift CI/CD and OpenShift Serverless.

Here is a short list of the training courses we created during my tenure:

While contributing content for these courses, I had a chance to use my connections from consultancy to reach out to people who are experts on those topics, so we could get a lot of useful feedback before, during, and after course development.

I can say that the Kafka course is my favorite, as I not only contributed to its content but also helped shape its outline. It was the first course I had a chance to start from scratch, so I admit I had a lot of fun. (Also thanks to my friends Jaime and Pablo, who made this process fun for me. Long live the 3 Musketeers!)

I also had a chance to do a few Taste of Training (ToT) webinars, where we demonstrated a training-relevant technology and announced our present or future courses. You can find one about “OpenShift Serverless” still live on the Red Hat Webinars page, and another about “Building event-driven systems with structure” on my YouTube channel:

In the training team, I had many chances to use my previous skills, like presenting, people connections, fast learning, and product knowledge, and I learned new ones such as writing and training mechanics. Knowing how to write helped me a lot as a developer, because when you write, you learn how to express your opinions better and you become more organized. Also, if you start writing here and there, you inevitably get some attention, which helps your self-marketing. I don’t write much on popular technical content websites, but as I mentioned before, in my free time I help a developer content creation agency, where I can hone my writing and outlining skills.

It has been fun to work as a Content Architect at Red Hat, but after some time I felt I should focus on a single technology and work on it at a lower level. Since I have been very interested in Kafka and Strimzi for some time, I took a chance on moving to the Red Hat Kafka engineering team, and luckily I was able to become a part of that team, which consists of very smart, talented, and friendly people that I already know!

The New Journey: Red Hat Kafka Engineering

As I mentioned at the very beginning of the article, it has been quite a while since I last did R&D and wrote low-ish-level code. I mean, I wrote a lot of code for customer projects as a consultant, for the guided exercises and labs in the training team, and for my open-source pet projects, but writing code for the product itself is a different thing. It is something I haven’t been “directly” doing for 4+ years, so I have to admit that I am a bit nervous and excited about it.

However, I will hopefully have a chance to use what I gained in other teams and from past experiences, and waking up the “software engineer” inside me with those gains will, I believe, create a beneficial combination for the team.

I am very excited that I will have a chance to contribute directly to Strimzi and AMQ Streams. I know it is not like contributing from the outside, because when you become a member, expectations get higher:)

I am starting my Principal Software Engineer role in the Kafka/Strimzi engineering team as of December 1st, so wish me luck with my new journey at Red Hat:)


Change Data Capture with CockroachDB and Strimzi

Dealing with data is hard in the cloud-native era: the cloud itself has many dynamics, and microservices have their own unique data-related problems. As patterns emerge from dealing with similar problems over and over, technologists have invented many data integration patterns to solve them.

Change Data Capture (CDC) is one of those data integration patterns, and maybe one of the most popular ones nowadays. It captures row-level changes into a configurable sink for downstream processing such as reporting, caching, or full-text indexing, and, most importantly, it helps avoid dual writes.
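As a minimal sketch of the idea (the event shape and the dictionary "sink" below are hypothetical, for illustration only, not tied to any specific product), a CDC consumer simply replays row-level change events against a downstream store:

```python
# A minimal sketch of the CDC idea: row-level change events from one
# source of truth drive a downstream sink (here, a plain dict standing
# in for a cache or search index), so the application never has to
# write to two systems at once and risk dual-write inconsistencies.

def apply_change(sink, event):
    """Apply a single row-level change event to the downstream sink."""
    key = event["key"]
    if event["after"] is None:
        sink.pop(key, None)         # delete: the row is gone downstream too
    else:
        sink[key] = event["after"]  # insert or update: upsert the new state
    return sink

sink = {}
events = [
    {"key": 1, "after": {"id": 1, "balance": 1000}},  # insert
    {"key": 1, "after": {"id": 1, "balance": 700}},   # update
    {"key": 1, "after": None},                        # delete
]
for event in events:
    apply_change(sink, event)

print(sink)  # {} - after replaying all events, the row is deleted downstream too
```

Because the sink is driven only by the change stream, it converges to the same state as the source, which is exactly the consistency property the dual-write problem breaks.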

Many technologies implement CDC as either a part of their product’s solution or the main functionality of an open-source project like Debezium.

CockroachDB is one of those technologies that implement CDC as a part of their product features.

CockroachDB is a distributed database that supports standard SQL. It is designed to survive software and hardware failures, from server restarts to data center outages.

However well designed CockroachDB is, it needs to coexist with other systems like full-text search engines, analytics engines, or data pipeline systems. For this, it has Changefeeds, which can emit changes to sinks like AWS S3, webhooks, and, most importantly, Apache Kafka, here run with Strimzi: a Kafka-on-Kubernetes solution.

Strimzi is a CNCF sandbox project, which provides an easy way to run and manage an Apache Kafka cluster on Kubernetes or OpenShift. Strimzi is a Kubernetes Operator, and it provides an easy and flexible configuration of a Kafka cluster, empowered by the capabilities of Kubernetes/OpenShift.

In this tutorial, you will:

  • Run a Kafka cluster on OpenShift/Kubernetes.
  • Create a topic within Strimzi by using the Strimzi CLI.
  • Create a CockroachDB cluster on OpenShift and use its SQL client.
  • Create a table on CockroachDB and configure it for using CDC.
  • Create, update, delete records in the CockroachDB table.
  • Consume the change events from the relevant Strimzi topic by using the Strimzi CLI.

Prerequisites

You’ll need the following for this tutorial:

  • oc CLI.
  • Strimzi CLI.
  • A 30-day trial license for CockroachDB, which is required to use CockroachDB’s CDC feature.
  • The Strimzi 0.26.1 or AMQ Streams 2.1.0 operator installed on your OpenShift/Kubernetes cluster.
  • The CockroachDB 2.3.0 operator installed on your OpenShift/Kubernetes cluster.

CockroachBank’s Request

Suppose that you are a consultant who works with a customer, CockroachBank LLC.

They use CockroachDB on OpenShift and their daily bank account transactions are kept in CockroachDB. Currently, they have a mechanism for indexing the bank account transaction changes in Elasticsearch but they noticed that it creates data inconsistencies between the actual data and the indexed log data that is in Elasticsearch.

They want you to create a core mechanism that avoids any data inconsistency issue between systems. They require you to create a basic implementation of a CDC by using CockroachDB’s Changefeed mechanism and because they use CockroachDB on OpenShift, they would like you to use Strimzi.

You only need to implement the CDC part, so you do not need to implement the Elasticsearch part. To simulate the CockroachBank system, you must install the AMQ Streams and CockroachDB operators on your OpenShift cluster and create their instances.

The following image is the architectural diagram of the system they require you to implement:

Required Architecture

Creating a Strimzi Cluster

To create a Strimzi cluster, you need a namespace/project created on your OpenShift/Kubernetes cluster. You can use the oc CLI to create a namespace/project.

oc new-project cockroachbank

Run the kfk command of Strimzi CLI to create a Kafka cluster with one broker. A small cluster with one broker is enough for a demo.

export STRIMZI_KAFKA_CLI_STRIMZI_VERSION=0.26.1 && \
kfk clusters --create \
--cluster my-cluster \
--replicas 1 \
--zk-replicas 1 \
-n cockroachbank -y

NOTE

You can download the Strimzi CLI Cheat Sheet from here for more information about the CLI commands.


Verify that you have all the Strimzi pods up and running in Ready state.

oc get pods -n cockroachbank

The output should be as follows:

NAME                                          READY   STATUS    ...
...output omitted...
my-cluster-entity-operator-64b8dcf7f6-cmjz8   3/3     Running   ...
my-cluster-kafka-0                            1/1     Running   ...
my-cluster-zookeeper-0                        1/1     Running   ...
...output omitted...

Running CockroachDB on OpenShift

To install CockroachDB on your OpenShift cluster, download the CockroachDB cluster custom resource; the CockroachDB operator creates the cluster for you by using the resource definition you provide in the YAML.

The YAML content should be as follows:

kind: CrdbCluster
apiVersion: crdb.cockroachlabs.com/v1alpha1
metadata:
  name: crdb-tls-example
  namespace: cockroachbank
spec:
  cockroachDBVersion: v21.1.10
  dataStore:
    pvc:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        volumeMode: Filesystem
  nodes: 3
  tlsEnabled: false

Run the following command to apply the resource YAML on OpenShift.

oc create -f crdb-cluster.yaml -n cockroachbank

Verify that you have all the CockroachDB pods up and running in Ready state.

oc get pods -n cockroachbank

The output should be as follows:

NAME                                          READY   STATUS    ...
...output omitted...
crdb-tls-example-0                            1/1     Running   ...
crdb-tls-example-1                            1/1     Running   ...
crdb-tls-example-2                            1/1     Running   ...
...output omitted...

In another terminal run the following command to access CockroachDB SQL client interface:

oc rsh -n cockroachbank crdb-tls-example-0 cockroach sql --insecure

On the SQL client interface, run the following commands to enable the trial for enterprise usage. You must do this because CDC is part of Enterprise Changefeeds, and you need an enterprise license. Refer to the Prerequisites section of this tutorial if you have not obtained a trial license yet.

SET CLUSTER SETTING cluster.organization = '_YOUR_ORGANIZATION_';
SET CLUSTER SETTING enterprise.license = '_YOUR_LICENSE_';

Creating and Configuring a CockroachDB Table

On the SQL query client terminal window, run the following command to create a database called bank on CockroachDB.

root@:26257/defaultdb> CREATE DATABASE bank;

Select the bank database to use for the rest of the actions in the query window.

root@:26257/defaultdb> USE bank;

Create a table called account with the id and balance fields.

root@:26257/bank> CREATE TABLE account (id INT PRIMARY KEY, balance INT);

Create a Changefeed for the account table. Set the Strimzi Kafka cluster bootstrap address for the Changefeed to specify the broker to send the captured change data to:

root@:26257/bank> CREATE CHANGEFEED FOR TABLE account INTO 'kafka://my-cluster-kafka-bootstrap:9092' WITH UPDATED;

Notice that the Kafka bootstrap address is my-cluster-kafka-bootstrap:9092. This is the service URL that Strimzi provides for the Kafka cluster my-cluster you have created.


NOTE

For more information on creating a Changefeed on CockroachDB refer to this documentation page.


Leave the terminal window open for further instructions.

Creating a Strimzi Topic and Consuming Data

In another terminal window run the following command to create a topic called account:

kfk topics --create --topic account \
--partitions 1 --replication-factor 1 \
-c my-cluster -n cockroachbank

The output must be as follows:

kafkatopic.kafka.strimzi.io/account created

IMPORTANT

A topic with 1 partition and 1 replication factor is enough for this demonstration. If you have a Strimzi Kafka cluster with more than 1 broker, then you can configure your topic differently.


Notice that the topic has the same name as the CockroachDB table account. By default, CockroachDB CDC produces change data into a topic with the same name as the table.

In the same terminal window run the following command to start consuming from the account topic:

kfk console-consumer --topic account \
-c my-cluster -n cockroachbank

Leave the terminal window open to see the consumed messages for the further steps.

Capturing the Change Event Data

In the SQL client terminal window, run the following command to insert a sample account data into the account table.

root@:26257/bank> INSERT INTO account (id, balance) VALUES (1, 1000), (2, 250), (3, 700);

This should create the following accounts in the CockroachDB account table:

id balance
1 1000
2 250
3 700

After inserting the data, verify that the Strimzi CLI consumer prints out the consumed data:

{"after": {"balance": 1000, "id": 1}, "updated": "1655075852039229718.0000000000"}
{"after": {"balance": 250, "id": 2}, "updated": "1655075852039229718.0000000000"}
{"after": {"balance": 700, "id": 3}, "updated": "1655075852039229718.0000000000"}
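The updated field is CockroachDB’s hybrid logical clock (HLC) timestamp: the digits before the decimal point are nanoseconds since the Unix epoch, and the fractional part is a logical counter. Assuming that format, a small Python snippet can make the value human-readable:

```python
import json
from datetime import datetime, timezone

# Decode one changefeed message and turn its "updated" HLC timestamp
# (nanoseconds since the Unix epoch before the dot, a logical counter
# after it) into a readable UTC time. The message is copied from the
# consumer output above.
raw = '{"after": {"balance": 1000, "id": 1}, "updated": "1655075852039229718.0000000000"}'
event = json.loads(raw)

# Integer division keeps the conversion exact (no float precision loss).
nanos = int(event["updated"].split(".")[0])
wall_time = datetime.fromtimestamp(nanos // 1_000_000_000, tz=timezone.utc)
print(wall_time.isoformat())  # 2022-06-12T23:17:32+00:00
```

Because these timestamps come from the database rather than the consumer, they give a consistent ordering of changes even when messages are read later.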

To observe some more event changes on the topic data, run the following queries on the SQL client to change the balance between accounts.

root@:26257/bank> UPDATE account SET balance=700 WHERE id=1;
root@:26257/bank> UPDATE account SET balance=600 WHERE id=2;
root@:26257/bank> UPDATE account SET balance=650 WHERE id=3;

Verify that the new changes captured by CockroachDB CDC are printed out by the Strimzi CLI consumer. In the consumer’s terminal window, you should see a result as follows:

...output omitted...
{"after": {"balance": 700, "id": 1}, "updated": "1655075898837400914.0000000000"}
{"after": {"balance": 600, "id": 2}, "updated": "1655075904775509980.0000000000"}
{"after": {"balance": 650, "id": 3}, "updated": "1655075910511670876.0000000000"}

Notice the balance changes that represent a transaction history in the consumer output.

As the last step, delete the bank accounts to see how CockroachDB CDC captures the deletions. Run the following SQL query in the SQL client:

root@:26257/bank> DELETE FROM account where id <> 0;

The Strimzi CLI consumer should print the following captured events:

...output omitted...
{"after": null, "updated": "1655075949177132715.0000000000"}
{"after": null, "updated": "1655075949177132715.0000000000"}
{"after": null, "updated": "1655075949177132715.0000000000"}

Notice that the after field becomes null when the change data is a delete event.
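Since the only difference between an upsert and a delete in this default envelope is whether after is null, a downstream consumer can classify events with a few lines of Python (a sketch under that assumption, not part of the tutorial's setup):

```python
import json

def classify(event_line: str) -> str:
    """Classify a CockroachDB CDC event by inspecting its 'after' field."""
    event = json.loads(event_line)
    return "delete" if event.get("after") is None else "upsert"

print(classify('{"after": null, "updated": "1655075949177132715.0000000000"}'))
print(classify('{"after": {"balance": 700, "id": 1}, "updated": "1655075898837400914.0000000000"}'))
```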

Congratulations! You’ve successfully captured the bank account transaction changes, and streamed them as change events via Strimzi.

Conclusion

In this tutorial, you have learned how to run a Kafka cluster on OpenShift by using Strimzi, and how to create a topic by using the Strimzi CLI. Also, you have learned to install CockroachDB on OpenShift, create a table by using its SQL query interface, and configure it for using CDC. You’ve created change events on a CockroachDB table and consumed those events from the Strimzi Kafka topic by using the Strimzi CLI.

You can find the resources for this tutorial in this repository.


Strimzi Kafka CLI Version Update (Strimzi 0.26.1 – 0.28.0)

By Aykut Bulgu / March 26, 2022

Hi everyone. I recently updated the Strimzi Kafka CLI to the following Strimzi versions:

  • Strimzi 0.26.1 (for AMQ Streams 2.0.1) – 0.1.0a60
  • Strimzi 0.28.0 (latest) – 0.1.0a61

Note that 0.1.0a61 is the latest version and Strimzi CLI is still in alpha state.

If you are using AMQ Streams 2.0.1, then install Strimzi CLI as follows:

pip install strimzi-kafka-cli==0.1.0a60

If you are using Strimzi 0.28.0 (currently the latest):

By using pip:

pip install strimzi-kafka-cli

By using homebrew:

brew tap systemcraftsman/strimzi-kafka-cli && brew install strimzi-kafka-cli

Or you can upgrade to the latest version, which is currently 0.1.0a61:

By using pip:

pip install strimzi-kafka-cli --upgrade

By using homebrew:

brew upgrade strimzi-kafka-cli

Any updates to Strimzi Kafka CLI will be made on the latest version, so version 0.1.0a60, which targets Strimzi 0.26.1, will not be affected by these changes. For now there is no LTS support for any AMQ Streams version, because the CLI is still in the alpha state.

If you want to use a recent version of the CLI with AMQ Streams 2.0.1 (Strimzi 0.26.1), set the following environment variable; the current latest version is compatible with Strimzi 0.26.1 once it is set:

export STRIMZI_KAFKA_CLI_STRIMZI_VERSION=0.26.1

Stay tuned for future releases. See the current workflow here and contribute if you are interested!

New to Strimzi CLI? Check out this cheat sheet:


Installing Strimzi CLI with Homebrew

By Aykut Bulgu / October 5, 2021

It has been a while since a friend of mine asked if Strimzi CLI could be installed with Homebrew.

I had to say “No” at the time. But now, my friend, we have a Homebrew installer:)

I had been thinking that the Python package installer was enough, because any OS can have Python and pip, and therefore the CLI. After some time I changed my mind, since package managers are tools that people can be quite attached to.

So why not start to support other package managers? Asking this question to myself, I decided to start with Homebrew because macOS (and Mac itself) is pretty popular among developers.

To install Strimzi CLI with Homebrew, first, you need to add the `tap` of the package:

brew tap systemcraftsman/strimzi-kafka-cli

Then you can run the `brew install` command for Strimzi CLI:

brew install strimzi-kafka-cli

The installer checks whether you have Python 3 on your machine and installs it if it is missing.

Here is a short video in which I demonstrate how to install Strimzi CLI with Homebrew.

More curious about Strimzi CLI? Check out the Strimzi Kafka CLI Cheat Sheet to explore Strimzi Kafka CLI capabilities at a glance!


A Cheat Sheet for Strimzi Kafka CLI

TL;DR: You can download the Strimzi CLI Cheat Sheet from this link if you are curious about Strimzi CLI capabilities and want a kind of quick reference guide for it.

It is a little bit more than one year since I first announced Strimzi Kafka CLI; a command-line interface for Strimzi Kafka Operator.

It was a 4-month-old project back then, and I was too excited to wait for a Beta release before sharing it.

Well, now it is a nearly 1.5-year-old project that is still in Alpha, since there is a lot to harden for the Beta roadmap. In the meanwhile, it has gained many features, like creating the operator from the command line, creating a Kafka cluster with just one command, and managing a Kafka Connect cluster and its connectors in a way similar to the traditional one.

During this time, I published a variety of videos on the System Craftsman YouTube channel about Strimzi CLI features and how it simplifies Strimzi usage imperatively. Most importantly, I shared its advantages over the declarative model by telling its story in a presentation at an internal Red Hat event. (I am planning to do it publicly as well by submitting the same talk to CFPs.)

Seeing a lot of interest in the videos and the posts I have been sharing here, I thought a document gathering all the current features of the CLI in a summarized way would be great. So I decided to create a cheat sheet for Strimzi Kafka CLI.

After a few months of work (yes, unfortunately, it took a bit long since I have a full-time job🙂), I was able to finish the cheat sheet and find a proper way to distribute it safely.

The cheat sheet has shell command examples for different features of Strimzi CLI. If you take a quick look inside, you will see it has 7-8 pages covering more or less the following concepts:

  • A short definition of what Strimzi Kafka Operator is
  • How Strimzi Kafka CLI works
  • How to install it
  • Managing Strimzi Kafka Operator via CLI
  • Managing Kafka clusters via CLI
  • Managing Kafka topics
  • Producing and consuming messages
  • Managing users
  • Managing ACLs
  • Managing cluster, topic and user configurations
  • Kafka Connect and connectors

You can download the Strimzi CLI Cheat Sheet from this link if you are curious about Strimzi CLI capabilities and want a kind of quick reference guide for it.

Here is a short video in which I do a quick walk-through of the Strimzi Kafka CLI Cheat Sheet:


Creating a Kafka Connect Cluster on Strimzi by using Strimzi CLI

Kafka Connect

In this example, we will create a Kafka Connect cluster and a few connectors that consume particular Twitter topics and write them to an Elasticsearch index to be searched easily.

We are not going to deal with any custom resources of Strimzi. Instead we will use traditional .properties files, as used for Kafka Connect instances and connectors, with the help of Strimzi Kafka CLI.

Prerequisites

  • A Kubernetes/OpenShift cluster that has Strimzi Kafka Operator installed.
  • A namespace called kafka and a Kafka cluster called my-cluster
  • An elasticsearch instance up and running in the same namespace. (You can use the elasticsearch.yaml file in the repository if you have the ElasticSearch Operator running.)
  • A public image registry that has a repository called demo-connect-cluster.
  • Most importantly, a Twitter Developer Account that enables you to use Twitter API for development purposes. In this example we are going to use it with one of our Kafka Connect connectors.
  • This part of the repository. Clone this repository to be able to use the scripts provided for this example.

As a first step, create the namespace:

$ kubectl create namespace kafka

Or you can use the new-project command if you are using OpenShift.

$ oc new-project kafka

Then install the Strimzi Operator if it's not already installed. You can use Strimzi CLI for this:

$ kfk operator --install -n kafka

Clone the repo if you haven't done so already, and cd into the example's directory.

$ git clone https://github.com/systemcraftsman/strimzi-kafka-cli.git
$ cd strimzi-kafka-cli/examples/5_connect

Create Kafka and Elasticsearch cluster by running the setup_example.sh script in the example's directory:

$ chmod +x ./scripts/setup_example.sh
$ ./scripts/setup_example.sh

This will create a Kafka cluster with 2 brokers, and an Elasticsearch cluster that's accessible through a Route.

Keep in mind that this script doesn't create the Elasticsearch Operator, which the Elasticsearch resource created by the script depends on. So you will need to install the Elasticsearch Operator before running the helper script.


NOTE

If you are using Kubernetes you can create an Ingress and expose the Elasticsearch instance.

Exposing the Elasticsearch instance is not mandatory; you can access the Elasticsearch instance cluster internally.


Lastly, create an empty repository in an image registry of your choice. For this example we are going to use Quay.io; our repository will be quay.io/systemcraftsman/demo-connect-cluster.

Creating a Kafka Connect Cluster with a Twitter Source Connector

For this example, to show how easy it is to create a Kafka Connect cluster from a traditional properties file, we will use an example from a well-known Kafka instructor, Stephane Maarek, who demonstrates a very basic Twitter Source Connector in one of his courses.

So let's clone the repository https://github.com/simplesteph/kafka-beginners-course.git and change directory into the kafka-connect folder in the repository.

In the repository we have this twitter.properties file which has the following config in it:

name=TwitterSourceDemo
tasks.max=1
connector.class=com.github.jcustenborder.kafka.connect.twitter.TwitterSourceConnector

# Set these required values
process.deletes=false
filter.keywords=bitcoin
kafka.status.topic=twitter_status_connect
kafka.delete.topic=twitter_deletes_connect
# put your own credentials here - don't share with anyone
twitter.oauth.consumerKey=
twitter.oauth.consumerSecret=
twitter.oauth.accessToken=
twitter.oauth.accessTokenSecret=

This connector gets tweet statuses or deletions and saves them into the twitter_status_connect or twitter_deletes_connect topic, depending on the action. The filter.keywords property defines the keywords used to filter the returned tweets. In this case it is set to bitcoin, so the connector will consume every tweet that mentions bitcoin and put it in the relevant topic.
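The keyword filtering itself happens on the Twitter API side, but conceptually it behaves like this small Python sketch (illustrative only; matches_filter is a hypothetical helper, not part of the connector):

```python
def matches_filter(tweet_text, keywords):
    # Keep a tweet if its text contains any of the configured keywords.
    text = tweet_text.lower()
    return any(keyword.lower() in text for keyword in keywords)

print(matches_filter("Bitcoin hits a new high today", ["bitcoin"]))
print(matches_filter("Nothing to see here", ["bitcoin"]))
```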

Now let's make a few changes to this file, regarding the restrictions that Strimzi has for topic names.

Copy the twitter.properties file and save it as twitter_connector.properties which you will be editing.

In the new file, change twitter_status_connect to twitter-status-connect, as Strimzi would complain about the original name, which is not a good name for a topic. Apache Kafka normally returns a warning about the underscore (_) convention but allows it. Since Strimzi uses custom resources for managing Kafka resources, it is not good practice to use underscores in topic names, or in the name of any other Strimzi custom resource.

Also change twitter_deletes_connect to twitter-deletes-connect, and the connector name to twitter-source-demo, for a consistent convention.
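If you have many topics to rename, the hyphenation rule can be automated. The sketch below is a hypothetical helper that only illustrates the Kubernetes-friendly naming convention used here; it is not how Strimzi itself maps topic names:

```python
import re

def to_k8s_friendly(name):
    # Lowercase the name and replace characters that are invalid in a
    # Kubernetes resource name (such as underscores) with hyphens.
    sanitized = re.sub(r"[^a-z0-9.-]", "-", name.lower())
    return sanitized.strip("-.")

print(to_k8s_friendly("twitter_status_connect"))
print(to_k8s_friendly("twitter_deletes_connect"))
```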

Enter your Twitter OAuth keys, which you can get from your Twitter Developer Account. Stephane explains the creation of a Twitter Developer Account perfectly in his Kafka for Beginners course on Udemy, so I recommend you take a look at both the course and the Twitter setup it explains.

Finally, change the bitcoin filter to kafka for our demo (or change it to anything you want to see tweets about).

The final connector configuration file should look like this:

name=twitter-source-demo
tasks.max=1
connector.class=com.github.jcustenborder.kafka.connect.twitter.TwitterSourceConnector

# Set these required values
process.deletes=false
filter.keywords=kafka
kafka.status.topic=twitter-status-connect
kafka.delete.topic=twitter-deletes-connect
# put your own credentials here - don't share with anyone
twitter.oauth.consumerKey=_YOUR_CONSUMER_KEY_
twitter.oauth.consumerSecret=_YOUR_CONSUMER_SECRET_
twitter.oauth.accessToken=_YOUR_ACCESS_TOKEN_
twitter.oauth.accessTokenSecret=_YOUR_ACCESS_TOKEN_SECRET_

Notice how little we changed (actually just the names) in order to use it in the Strimzi Kafka Connect cluster.

Because we are going to need the twitter-status-connect and twitter-deletes-connect topics, let's create them upfront before continuing with the configuration. You may remember topic creation with the kfk topics --create command of Strimzi Kafka CLI:

$ kfk topics --create --topic twitter-status-connect --partitions 3 --replication-factor 1 -c my-cluster -n kafka
$ kfk topics --create --topic twitter-deletes-connect --partitions 3 --replication-factor 1 -c my-cluster -n kafka

Now let's continue with our Connect cluster's creation.

In the same repository we have this connect-standalone.properties file which has the following config in it:

...output omitted...
bootstrap.servers=localhost:9092

...output omitted...
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter

...output omitted...
key.converter.schemas.enable=true
value.converter.schemas.enable=true

offset.storage.file.filename=/tmp/connect.offsets
# Flush much faster than normal, which is useful for testing/debugging
offset.flush.interval.ms=10000

...output omitted...
plugin.path=connectors

Kafka Connect normally has this plugin.path key, which points to all the connector binaries to be used by any connector created for that Connect cluster. In our case, for Strimzi, it will be a bit different: because we are going to create our Connect cluster in a Kubernetes/OpenShift environment, we should either create an image locally or make Strimzi build the Connect image for us. We will use the second option, which is a fairly new feature of Strimzi.

The only thing we have to do, instead of defining a path, is define a set of URLs that point to the different connector resources. So let's copy the file that Stephane created for us and save it as connect.properties, since Kafka Connect works in distributed mode in Strimzi (in short, there is no standalone mode).

In the connect.properties file, replace plugin.path with plugin.url and set it to the following source URL:

plugin.url=https://github.com/jcustenborder/kafka-connect-twitter/releases/download/0.2.26/kafka-connect-twitter-0.2.26.tar.gz

If you compare with the original repository, you can see that the connectors folder contains a bunch of JAR files that the Twitter Source Connector uses. The URL that you set above points to the same resources in archived form. Strimzi extracts them while building the Connect image in the Kubernetes/OpenShift cluster.

Speaking of the image, we have to set an image, actually an image repository path, that Strimzi can push the built image into. This can be either an internal registry of yours, or a public one like Docker Hub or Quay. In this example we will use Quay and we should set the image URL like the following:

image=quay.io/systemcraftsman/demo-connect-cluster:latest

Here you can set the repository URL of your choice instead of quay.io/systemcraftsman/demo-connect-cluster:latest. As a prerequisite, you have to create this repository and make the credentials ready for the image push for Strimzi.

Apart from the plugin.path, we can make a few more changes, like changing the offset storage to a topic instead of a file, and disabling the key/value converter schemas because we only need to see the data itself; we don't need the JSON schemas.

Lastly, change the bootstrap.servers value to my-cluster-kafka-bootstrap:9092; my-cluster-kafka-bootstrap is the internal host name of the my-cluster Kafka cluster, provided by a Kubernetes Service.

So the final connect.properties file should look like this:

...output omitted...

bootstrap.servers=my-cluster-kafka-bootstrap:9092

# The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will
# need to configure these based on the format they want their data in when loaded from or stored into Kafka
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Converter-specific settings can be passed in by prefixing the Converter's setting with the converter we want to apply
# it to
key.converter.schemas.enable=false
value.converter.schemas.enable=false

offset.storage.topic=connect-cluster-offsets
config.storage.topic=connect-cluster-configs
status.storage.topic=connect-cluster-status
config.storage.replication.factor=1
offset.storage.replication.factor=1
status.storage.replication.factor=1

...output omitted...

image=quay.io/systemcraftsman/demo-connect-cluster:latest
plugin.url=https://github.com/jcustenborder/kafka-connect-twitter/releases/download/0.2.26/kafka-connect-twitter-0.2.26.tar.gz
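A Java-style .properties file like the one above is straightforward to load programmatically. As a rough sketch of what a CLI tool might do (this is not Strimzi Kafka CLI's actual code):

```python
def load_properties(text):
    """Parse simple key=value properties, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

sample = """
# Connect cluster settings
bootstrap.servers=my-cluster-kafka-bootstrap:9092
key.converter.schemas.enable=false
"""

cfg = load_properties(sample)
print(cfg["bootstrap.servers"])
```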

Again, notice how small the changes are to make it compatible with a Strimzi Kafka Connect cluster. Now let's run the Kafka Connect cluster in the way we used to with the traditional Kafka CLI.

To start a standalone Kafka Connect cluster traditionally, you may be familiar with a command like the following:

$ ./bin/connect-standalone.sh connect.properties connector.properties

The command syntax for Strimzi Kafka CLI is the same. This means you can create a Connect cluster along with one or more connectors by providing their config properties. The only difference is, Strimzi runs the Connect cluster in the distributed mode.

Run the following command to create a Connect cluster called my-connect-cluster and a connector called twitter-source-demo. Don't forget to replace _YOUR_IMAGE_REGISTRY_USER_ with your image registry user.

$ kfk connect clusters --create --cluster my-connect-cluster --replicas 1 -n kafka connect.properties twitter_connector.properties -u _YOUR_IMAGE_REGISTRY_USER_ -y

IMPORTANT

You can also create the cluster in a more controlled way, by not passing the -y flag. Without it, Strimzi Kafka CLI shows you the resource YAML of the Kafka Connect cluster in an editor, and you can modify or just save the resource before creation. In this example we skip this part with the -y flag.


You should be prompted for the registry password. Enter the password and observe the CLI response as follows:

secret/my-connect-cluster-push-secret created
kafkaconnect.kafka.strimzi.io/my-connect-cluster created
kafkaconnector.kafka.strimzi.io/twitter-source-demo created

IMPORTANT

Be careful while entering the password, because Strimzi Kafka CLI has no mechanism to verify it. If the password is wrong, the Connect image will still be built successfully, but Strimzi won't be able to push it to the registry you specified.

In case of any problem just delete the Connect cluster with the following command and create it again:

$ kfk connect clusters --delete --cluster my-connect-cluster -n kafka -y

Or, if you are experienced enough, you can delete and re-create the push secret.


Now you can check the pods and wait till the Connect cluster pod runs without a problem.

$ watch kubectl get pods -n kafka
...output omitted...
my-connect-cluster-connect-8444df69c9-x7xf6   1/1     Running     0          3m43s
my-connect-cluster-connect-build-1-build      0/1     Completed   0          6m47s
...output omitted...

If everything is OK with the Connect cluster, you should now see some messages in one of the topics we created earlier. Let's consume messages from the twitter-status-connect topic to see if our Twitter Source Connector works.

$ kfk console-consumer --topic twitter-status-connect -c my-cluster -n kafka
...output omitted...
{"CreatedAt":1624542267000,"Id":1408058441428439047,"Text":"@Ch1pmaster @KAFKA_Dev Of het is gewoon het zoveelste boelsjit verhaal van Bauke...
...output omitted...

Observe that tweets appear in the console one by one as they are written to the twitter-status-connect topic and consumed by the consumer.

As you can see, we took a couple of traditional config files from the samples of one of the most loved Kafka instructors and, with just a few configuration changes, easily created our Kafka Connect cluster along with a Twitter Source connector.

Now let's take a step forward and try something else. What about putting these tweets in an Elasticsearch index and making them searchable?

Altering the Kafka Connect Cluster

In order to get the tweets from the twitter-status-connect topic and index them in Elasticsearch we need to use a connector that does this for us.

Camel Elasticsearch REST Kafka Sink Connector is the connector that will do the magic for us.

First we need to add the relevant plugin resources of Camel Elasticsearch REST Sink Connector in our current connect.properties file that configures our Kafka Connect cluster.

Add the URL of the connector like the following in the connect.properties file:

...output omitted...
plugin.url=https://github.com/jcustenborder/kafka-connect-twitter/releases/download/0.2.26/kafka-connect-twitter-0.2.26.tar.gz,https://repo.maven.apache.org/maven2/org/apache/camel/kafkaconnector/camel-elasticsearch-rest-kafka-connector/0.10.0/camel-elasticsearch-rest-kafka-connector-0.10.0-package.tar.gz
...output omitted...
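Each entry in the comma-separated plugin.url value becomes one downloadable artifact for the Connect image build. A minimal sketch of splitting the value (assuming simple URLs with no embedded commas):

```python
plugin_url = (
    "https://github.com/jcustenborder/kafka-connect-twitter/releases/download/0.2.26/kafka-connect-twitter-0.2.26.tar.gz,"
    "https://repo.maven.apache.org/maven2/org/apache/camel/kafkaconnector/camel-elasticsearch-rest-kafka-connector/0.10.0/camel-elasticsearch-rest-kafka-connector-0.10.0-package.tar.gz"
)

# Split into individual archive URLs, one per plugin artifact.
artifacts = [url.strip() for url in plugin_url.split(",") if url.strip()]
print(len(artifacts))
```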

Now run the kfk connect clusters command again, this time with the --alter flag.

$ kfk connect clusters --alter --cluster my-connect-cluster -n kafka connect.properties
kafkaconnect.kafka.strimzi.io/my-connect-cluster replaced

Observe that the Connect image is being built again by watching the pods.

$ watch kubectl get pods -n kafka

Wait until the build finishes, and the connector pod is up and running again.

...output omitted...
my-connect-cluster-connect-7b575b6cf9-rdmbt   1/1     Running     0          111s
...output omitted...
my-connect-cluster-connect-build-2-build      0/1     Completed   0          2m37s

Because we have a running Connect cluster ready for a Camel Elasticsearch REST Sink Connector, we can create our connector now, this time using the kfk connect connectors command.

Creating a Camel Elasticsearch REST Sink Connector

Create a file called camel_es_connector.properties and paste the following in it.

name=camel-elasticsearch-sink-demo
tasks.max=1
connector.class=org.apache.camel.kafkaconnector.elasticsearchrest.CamelElasticsearchrestSinkConnector

value.converter=org.apache.kafka.connect.storage.StringConverter

topics=twitter-status-connect
camel.sink.endpoint.hostAddresses=elasticsearch-es-http:9200
camel.sink.endpoint.indexName=tweets
camel.sink.endpoint.operation=Index
camel.sink.path.clusterName=elasticsearch
errors.tolerance=all
errors.log.enable=true
errors.log.include.messages=true

Observe that our connector's name is camel-elasticsearch-sink-demo and we use the CamelElasticsearchrestSinkConnector class to read the tweets from twitter-status-connect topic.

Properties starting with camel.sink. define the connector-specific settings. With these properties the connector will create an index called tweets in the Elasticsearch cluster, which is accessible at the elasticsearch-es-http:9200 host and port.

For more details on this connector, please visit the connector's configuration page linked above.

Creating a connector is very simple. If you defined a topic or another object of Strimzi via Strimzi Kafka CLI before, you will notice the syntax is pretty much the same.

Run the following command to create the connector for Camel Elasticsearch REST Sink:

$ kfk connect connectors --create -c my-connect-cluster -n kafka camel_es_connector.properties
kafkaconnector.kafka.strimzi.io/camel-elasticsearch-sink-demo created

You can list the created connectors so far:

$ kfk connect connectors --list -c my-connect-cluster -n kafka
NAME                            CLUSTER              CONNECTOR CLASS                                                                         MAX TASKS   READY
twitter-source-demo             my-connect-cluster   com.github.jcustenborder.kafka.connect.twitter.TwitterSourceConnector                   1           1
camel-elasticsearch-sink-demo   my-connect-cluster   org.apache.camel.kafkaconnector.elasticsearchrest.CamelElasticsearchrestSinkConnector   1           1

After the resource is created, run the following curl command in watch mode to observe how the indexed document count increases as tweets are consumed. Replace _ELASTIC_EXTERNAL_URL_ with the Route or Ingress URL of the Elasticsearch cluster you created as a prerequisite.

$ watch "curl -s http://_ELASTIC_EXTERNAL_URL_/tweets/_search | jq -r '.hits.total.value'"
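The jq filter .hits.total.value simply walks the Elasticsearch search response structure. The same extraction in Python, against a trimmed example response (the document count here is illustrative):

```python
import json

# A trimmed Elasticsearch _search response body, as returned by the query above.
response_body = '{"took": 3, "hits": {"total": {"value": 3, "relation": "eq"}, "hits": []}}'

response = json.loads(response_body)
# Equivalent of jq's .hits.total.value
print(response["hits"]["total"]["value"])
```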

In another terminal window you can run the console consumer again to see both the Twitter Source connector and the Camel Elasticsearch Sink connector in action:

(Screenshot: tweets flowing in the console consumer.)

In a browser or with curl, call the following URL to search for the word Apache in the tweet texts.

$ curl -s http://_ELASTIC_EXTERNAL_URL_/tweets/_search?q=Text:Apache
{"took":3,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":3,"relation":"eq"},"max_score":5.769906,"hits":[{"_index":"tweets","_type":"_doc","_id":"bm6aPnoBRxta4q47oss0","_score":5.769906,"_source":{"CreatedAt":1624542084000,"Id":1408057673577345026,"Text":"RT @KCUserGroups: June 29: Kansas City Apache Kafka® Meetup by Confluent - Testing with AsyncAPI for Apache Kafka: Brokering the Complexity…","Source":"<a href=\"http://twitter.com/download/android\" rel=\"nofollow\">Twitter for Android</a>","Truncated":false,"InReplyToStatusId":-1,"InReplyToUserId":-1,"InReplyToScreenName":null,"GeoLocation":null,"Place":null,"Favorited":false,"Retweeted":false,"FavoriteCount":0,"User":{"Id":87489271,"Name":"Fran Méndez","ScreenName":"fmvilas","Location":"Badajoz, España","Description":"Founder of @AsyncAPISpec. Director of Engineering at @getpostman.\n\nAtheist, feminist, proud husband of @e_morcillo, and father of Ada & 2 cats \uD83D\uDC31\uD83D\uDC08 
he/him","ContributorsEnabled":false,"ProfileImageURL":"http://pbs.twimg.com/profile_images/1373387614238179328/cB1gp6Lh_normal.jpg","BiggerProfileImageURL":"http://pbs.twimg.com/profile_images/1373387614238179328/cB1gp6Lh_bigger.jpg","MiniProfileImageURL":"http://pbs.twimg.com/profile_images/1373387614238179328/cB1gp6Lh_mini.jpg","OriginalProfileImageURL":"http://pbs.twimg.com/profile_images/1373387614238179328/cB1gp6Lh.jpg","ProfileImageURLHttps":"https://pbs.twimg.com/profile_images/1373387614238179328/cB1gp6Lh_normal.jpg","BiggerProfileImageURLHttps":"https://pbs.twimg.com/profile_images/1373387614238179328/cB1gp6Lh_bigger.jpg","MiniProfileImageURLHttps":"https://pbs.twimg.com/profile_images/1373387614238179328/cB1gp6Lh_mini.jpg","OriginalProfileImageURLHttps":"https://pbs.twimg.com/profile_images/1373387614238179328/cB1gp6Lh.jpg","DefaultProfileImage":false,"URL":"http://www.fmvilas.com","Protected":false,"FollowersCount":1983,"ProfileBackgroundColor":"000000","ProfileTextColor":"000000","ProfileLinkColor":"1B95E0","ProfileSidebarFillColor":"000000","ProfileSidebarBorderColor":"000000","ProfileUseBackgroundImage":false,"DefaultProfile":false,"ShowAllInlineMedia":false,"FriendsCount":3197,
...output omitted...

Cool! Our search for Apache hit some Apache Kafka tweets among the kafka-related tweets. How about yours? If you don't hit anything, you can search for any word of your choice.

Since we are almost done with our example, let's delete the resources one by one to observe how Strimzi Kafka CLI handles the deletion of Kafka Connect resources.

Deleting Connectors and the Kafka Connect Cluster

First let's delete our connectors one by one:

$ kfk connect connectors --delete --connector twitter-source-demo -c my-connect-cluster -n kafka
kafkaconnector.kafka.strimzi.io "twitter-source-demo" deleted
$ kfk connect connectors --delete --connector camel-elasticsearch-sink-demo -c my-connect-cluster -n kafka
kafkaconnector.kafka.strimzi.io "camel-elasticsearch-sink-demo" deleted

Observe that no more tweets are produced to the twitter-status-connect topic and no more data is indexed in Elasticsearch.

Now we can also delete the my-connect-cluster Kafka Connect cluster. Notice that the syntax is pretty much the same as the Kafka cluster deletion syntax of Strimzi CLI.

$ kfk connect clusters --delete --cluster my-connect-cluster -n kafka -y

This command will both delete the KafkaConnect resource and the push secret that is created for the Connect image.

kafkaconnect.kafka.strimzi.io "my-connect-cluster" deleted
secret "my-connect-cluster-push-secret" deleted

Check that the Connect cluster pod is terminated by the Strimzi operator:

$ kubectl get pods -n kafka
NAME                                          READY   STATUS    RESTARTS   AGE
elastic-operator-84774b4d49-v2lbr             1/1     Running   0          4h9m
elasticsearch-es-default-0                    1/1     Running   0          4h8m
my-cluster-entity-operator-5c84b64ddf-22t9p   3/3     Running   0          4h8m
my-cluster-kafka-0                            1/1     Running   0          4h8m
my-cluster-kafka-1                            1/1     Running   0          4h8m
my-cluster-zookeeper-0                        1/1     Running   0          4h8m

Congratulations!

In this example we created a Kafka Connect cluster along with a Twitter Source connector using Strimzi Kafka CLI, to consume tweets from Twitter and write them to one of the topics we defined in the configuration. We also altered the Kafka Connect cluster, adding new plugin resources for the Camel Elasticsearch REST Sink connector with a single --alter command, to write the tweets from the relevant topic to an Elasticsearch index. This made the consumed tweets searchable, so that we could search for the word Apache in our tweets index. After finishing the example, we cleaned up by easily deleting the resources with the CLI.

Access the repo of this post from here: https://github.com/systemcraftsman/strimzi-kafka-cli/tree/master/examples/5_connect

If you are interested in more, check out the video in which I create the Kafka Connect cluster and the relevant connectors from this article, using Strimzi Kafka CLI:


A Strimzi Kafka Quickstart for Quarkus with Strimzi CLI

Quarkus Strimzi Demo (by using Strimzi Kafka CLI)

This project illustrates how you can interact with Apache Kafka on Kubernetes (Strimzi) using MicroProfile Reactive Messaging.

Kafka cluster (on OpenShift)

First you need a Kafka cluster on your OpenShift.

Create a namespace first:

 oc new-project kafka-quickstart

Then install Strimzi Kafka CLI by running the following command (you will need Python 3 and pip):

 sudo pip install strimzi-kafka-cli

See here for other details about Strimzi Kafka CLI.

After installing Strimzi Kafka CLI run the following command to install the operator on kafka-quickstart namespace:

 kfk operator --install -n kafka-quickstart

When the operator is ready to serve, run the following command in order to create a Kafka cluster:

 kfk clusters --create --cluster my-cluster -n kafka-quickstart

A vim interface will pop up. You can change the broker and ZooKeeper replicas to 1 if you like, but I suggest leaving them as is if your Kubernetes cluster has enough resources. Save the cluster configuration file and respond Yes to make Strimzi CLI apply the changes.
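For reference, the replica counts live in the Kafka custom resource that the editor opens. A trimmed sketch of roughly what it looks like (the apiVersion and the exact defaults depend on your Strimzi version):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    replicas: 3        # broker count; lower to 1 only on constrained clusters
    # ... listeners, config, storage ...
  zookeeper:
    replicas: 3        # ZooKeeper node count
    # ... storage ...
```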

Wait until the 3 broker and 3 ZooKeeper pods are running and in the ready state in your cluster:

 oc get pods -n kafka-quickstart -w

When all pods are ready, create your prices topic to be used by the application:

 kfk topics --create --topic prices --partitions 10 --replication-factor 2 -c my-cluster -n kafka-quickstart
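Under the hood the CLI creates a KafkaTopic custom resource for this topic. A sketch of roughly what it ends up as (apiVersion depends on your Strimzi version):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: prices
  labels:
    strimzi.io/cluster: my-cluster   # binds the topic to the my-cluster Kafka cluster
spec:
  partitions: 10
  replicas: 2
```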

Check that your topic was created successfully by describing it natively:

kfk topics --describe --topic prices -c my-cluster -n kafka-quickstart --native

Deploy the application

The application can be deployed to OpenShift using:

 ./mvnw clean package -DskipTests

This will take a while since the s2i build runs before the deployment. Make sure the application's pod is running and ready in the end. Run the following command to get the URL of the Prices page:

echo http://$(oc get routes -n kafka-quickstart -o json | jq -r '.items[0].spec.host')/prices.html 
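If jq is not available, the same host extraction can be sketched with plain sed. The JSON below is an illustrative sample of the shape `oc get routes -o json` returns, not real cluster output:

```shell
# Sample of the JSON shape returned by `oc get routes -o json` (illustrative).
routes_json='{"items":[{"spec":{"host":"prices-kafka-quickstart.apps.example.com"}}]}'

# Pull the first "host" value out of the JSON and build the page URL.
host=$(printf '%s' "$routes_json" | sed -n 's/.*"host":"\([^"]*\)".*/\1/p')
url="http://${host}/prices.html"
echo "$url"
```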

Copy the URL to your browser, and you should see a fluctuating price.

Anatomy

In addition to the prices.html page, the application is composed of 3 components:

  • PriceGenerator
  • PriceConverter
  • PriceResource

We generate (random) prices in PriceGenerator. These prices are written to the prices Kafka topic that we created earlier. PriceConverter reads from the prices topic and applies some magic conversion to each price. The result is sent to an in-memory stream consumed by a JAX-RS resource, PriceResource. The data is sent to the browser using server-sent events.

The interaction with Kafka is managed by MicroProfile Reactive Messaging. The configuration is located in the application configuration.
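A minimal sketch of those channel mappings, assuming the channel names used by the standard Quarkus Kafka quickstart and the usual src/main/resources/application.properties location (your actual names may differ):

```properties
# PriceGenerator writes to this outgoing channel, mapped to the prices topic
mp.messaging.outgoing.generated-price.connector=smallrye-kafka
mp.messaging.outgoing.generated-price.topic=prices
mp.messaging.outgoing.generated-price.value.serializer=org.apache.kafka.common.serialization.IntegerSerializer

# PriceConverter reads from this incoming channel, also mapped to the prices topic
mp.messaging.incoming.prices.connector=smallrye-kafka
mp.messaging.incoming.prices.topic=prices
mp.messaging.incoming.prices.value.deserializer=org.apache.kafka.common.serialization.IntegerDeserializer
```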

Running and deploying in native

You can compile the application into a native binary using:

./mvnw clean install -Pnative

or deploy with:

 ./mvnw clean package -Pnative -DskipTests

This demo is based on the following resources:

Visit the following link to clone this demo:

https://github.com/systemcraftsman/quarkus-strimzi-cli-demos


Configuring Kafka Topics, Users and Brokers on Strimzi using Strimzi Kafka CLI

Topic, User and Broker Configuration

Strimzi Kafka CLI enables users to describe, create, and delete configurations of topics, users, and brokers, just as you would with the native Apache Kafka commands.

While the kfk configs command can be used to change the configuration of all three entity types, you can also change each entity's configuration by using the following:

  • kfk topics --config/--delete-config for adding and deleting configurations to topics.

  • kfk users --quota/--delete-quota for managing quotas as part of the user configuration.

  • kfk clusters --config/--delete-config for adding and deleting configurations to all brokers.

In this example we will show how to do the configuration by using kfk configs only, but we will also mention the options above. So let's start with topic configuration.

Topic Configuration

Considering we already have a Kafka cluster called my-cluster on our namespace called kafka, let's create a topic on it called my-topic:

kfk topics --create --topic my-topic --partitions 12 --replication-factor 3 -c my-cluster -n kafka

IMPORTANT

If you don't have a Kafka cluster created on your OpenShift/Kubernetes cluster yet, you can use the following command:

kfk clusters --create --cluster my-cluster -n kafka

If you don't have an operator on the current OpenShift/Kubernetes cluster, you can easily install one before creating a Kafka cluster:

kfk operator --install -n kafka

First let's see what pre-defined configurations we have on our topic:

kfk configs --describe --entity-type topics --entity-name my-topic -c my-cluster -n kafka --native

Since we are running our config --describe command with --native flag, we can see all the default dynamic configurations for the topic:

Dynamic configs for topic my-topic are:
  segment.bytes=1073741824 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
  retention.ms=7200000 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:retention.ms=7200000}

INFO

Additionally, you can describe all of the topic configurations natively on the current cluster. To do this, just remove the --entity-name option:

kfk configs --describe --entity-type topics -c my-cluster -n kafka --native

We can also describe the KafkaTopic custom resource itself by removing the --native flag:

kfk configs --describe --entity-type topics --entity-name my-topic -c my-cluster -n kafka
...
Spec:
  Config:
    retention.ms:   7200000
    segment.bytes:  1073741824
...

Now let's add a configuration like min.insync.replicas, which sets the minimum number of in-sync replicas that must acknowledge a write to the topic (when the producer uses acks=all). To add a configuration you must use --alter with the kfk configs command, passing --add-config for each config to be added:

kfk configs --alter --add-config min.insync.replicas=3 --entity-type topics --entity-name my-topic -c my-cluster -n kafka

You should see a message like this:

kafkatopic.kafka.strimzi.io/my-topic configured

Alternatively you can set the topic configuration by using kfk topics with --config option:

kfk topics --alter --topic my-topic --config min.insync.replicas=3 -c my-cluster -n kafka

To add two configs (say we want to add the cleanup.policy=compact configuration along with min.insync.replicas), run a command like the following:

kfk configs --alter --add-config 'min.insync.replicas=3,cleanup.policy=compact' --entity-type topics --entity-name my-topic -c my-cluster -n kafka

or

kfk topics --alter --topic my-topic --config min.insync.replicas=3 --config cleanup.policy=compact -c my-cluster -n kafka

To see the changes after setting the configurations, use the --describe flag like we did before:

kfk configs --describe --entity-type topics --entity-name my-topic -c my-cluster -n kafka --native

Output is:

Dynamic configs for topic my-topic are:
  min.insync.replicas=3 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:min.insync.replicas=3, DEFAULT_CONFIG:min.insync.replicas=1}
  cleanup.policy=compact sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:cleanup.policy=compact, DEFAULT_CONFIG:log.cleanup.policy=delete}
  segment.bytes=1073741824 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
  retention.ms=7200000 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:retention.ms=7200000}
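When scripting against this native output, ordinary text tools are enough. A minimal sketch that pulls one value out of a captured copy of the describe output (the variable below holds a shortened sample, not live output):

```shell
# A shortened sample of the `kfk configs --describe ... --native` output.
describe_output='Dynamic configs for topic my-topic are:
  min.insync.replicas=3 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:min.insync.replicas=3}
  cleanup.policy=compact sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:cleanup.policy=compact}'

# Extract the effective min.insync.replicas value.
misr=$(printf '%s\n' "$describe_output" \
  | grep -o 'min\.insync\.replicas=[0-9]*' \
  | head -n1 \
  | cut -d= -f2)
echo "$misr"
```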

To see the added configuration as a Strimzi resource, run the same command without the --native option:

kfk configs --describe --entity-type topics --entity-name my-topic -c my-cluster -n kafka
...
  Config:
    cleanup.policy:       compact
    min.insync.replicas:  3
    retention.ms:         7200000
    segment.bytes:        1073741824
...

Deleting a configuration is as easy as adding one. You can remove all the configurations that you've just set with a single command:

kfk configs --alter --delete-config 'min.insync.replicas,cleanup.policy' --entity-type topics --entity-name my-topic -c my-cluster -n kafka

or you can use:

kfk topics --alter --topic my-topic --delete-config min.insync.replicas --delete-config cleanup.policy -c my-cluster -n kafka

When you run the describe command again you will see that the relevant configurations are removed:

kfk configs --describe --entity-type topics --entity-name my-topic -c my-cluster -n kafka --native
Dynamic configs for topic my-topic are:
  segment.bytes=1073741824 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:segment.bytes=1073741824, DEFAULT_CONFIG:log.segment.bytes=1073741824}
  retention.ms=7200000 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:retention.ms=7200000}

As you can see, we can easily manipulate topic configurations almost like with the native shell executables of Apache Kafka. Now let's see how it is done for user configuration.

User Configuration

For the user configuration let's first create a user called my-user:

kfk users --create --user my-user --authentication-type tls -n kafka -c my-cluster

After creating the user, let's add two quota configurations: request_percentage=55 and consumer_byte_rate=2097152.

kfk configs --alter --add-config 'request_percentage=55,consumer_byte_rate=2097152' --entity-type users --entity-name my-user -c my-cluster -n kafka

Alternatively you can set the user quota configuration by using kfk users with --quota option:

kfk users --alter --user my-user --quota request_percentage=55 --quota consumer_byte_rate=2097152 -c my-cluster -n kafka
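Either way, the quotas end up on the KafkaUser custom resource. A trimmed sketch of roughly what the resource looks like afterwards (apiVersion depends on your Strimzi version):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: my-user
  labels:
    strimzi.io/cluster: my-cluster
spec:
  authentication:
    type: tls
  quotas:
    consumerByteRate: 2097152
    requestPercentage: 55
```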

IMPORTANT

In the traditional kafka-configs.sh command there are actually 5 user-level configurations, 3 of which are quota related:

consumer_byte_rate
producer_byte_rate
request_percentage

and the other 2 are the SCRAM authentication mechanisms:

SCRAM-SHA-256
SCRAM-SHA-512

While these two are also handled by kafka-configs.sh in traditional Kafka usage, in Strimzi CLI they are configured by altering the cluster with the kfk clusters --alter command and altering the user with the kfk users --alter command to set the relevant authentication type. So the kfk configs command does not support these two configurations.
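For example, switching a user to SCRAM is expressed on the KafkaUser resource itself (set via kfk users --alter); the relevant part of the resource roughly looks like:

```yaml
spec:
  authentication:
    type: scram-sha-512   # or scram-sha-256
```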


Now let's take a look at the configurations we just set:

kfk configs --describe --entity-type users --entity-name my-user -c my-cluster -n kafka --native
Configs for user-principal 'CN=my-user' are consumer_byte_rate=2097152.0, request_percentage=55.0

INFO

Additionally, you can describe all of the user configurations natively on the current cluster. To do this, just remove the --entity-name option:

kfk configs --describe --entity-type users -c my-cluster -n kafka --native

You can also see the changes in the Kubernetes native description:

kfk configs --describe --entity-type users --entity-name my-user -c my-cluster -n kafka
...
Spec:
...
  Quotas:
    Consumer Byte Rate:  2097152
    Request Percentage:  55
...

Deletion of the configurations is almost the same as deleting the topic configurations:

kfk configs --alter --delete-config 'request_percentage,consumer_byte_rate' --entity-type users --entity-name my-user -c my-cluster -n kafka

or

kfk users --alter --user my-user --delete-quota request_percentage=55 --delete-quota consumer_byte_rate=2097152 -c my-cluster -n kafka

You can see that an empty response is returned, since no configuration remains after the deletion:

kfk configs --describe --entity-type users --entity-name my-user -c my-cluster -n kafka --native

So we can easily update/create/delete the user configurations for Strimzi, almost like with the native shell executables of Apache Kafka. Now let's take our final step and see how it is done for broker configuration.

Broker Configuration

Adding configurations, whether dynamic or static, is as easy as it is for topics and users. For both configuration types Strimzi takes care of everything itself: it performs a rolling update of the brokers for static configurations and applies dynamic configurations directly.

Here is a way to add a static configuration that will be reflected after the rolling update of the brokers:

kfk configs --alter --add-config log.retention.hours=168 --entity-type brokers --entity-name my-cluster -c my-cluster -n kafka

or alternatively using the kfk clusters command:

kfk clusters --alter --cluster my-cluster --config log.retention.hours=168 -n kafka
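Both commands write the setting into the Kafka custom resource under spec.kafka.config; a trimmed sketch of the result:

```yaml
spec:
  kafka:
    config:
      log.retention.hours: 168
      offsets.topic.replication.factor: 3
      # ... other broker-wide settings ...
```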

IMPORTANT

Unlike with the native kafka-configs.sh command, the entity-name should be set to the Kafka cluster name rather than the broker IDs.


kfk configs --describe --entity-type brokers --entity-name my-cluster -c my-cluster -n kafka
...
  Kafka:
    Config:
      log.message.format.version:                2.6
      log.retention.hours:                       168
      offsets.topic.replication.factor:          3
      transaction.state.log.min.isr:             2
      transaction.state.log.replication.factor:  3
...

You can also describe the cluster configuration Kafka-natively like the following:

kfk configs --describe --entity-type brokers -c my-cluster -n kafka --native
Dynamic configs for broker 0 are:
Dynamic configs for broker 1 are:
Dynamic configs for broker 2 are:
Default configs for brokers in the cluster are:

All user provided configs for brokers in the cluster are:
log.message.format.version=2.6
log.retention.hours=168
offsets.topic.replication.factor=3
transaction.state.log.min.isr=2
transaction.state.log.replication.factor=3

IMPORTANT

Note that using describe with the --native flag doesn't require an --entity-name option, since it fetches the cluster-wide broker configuration. For a specific broker configuration you can set --entity-name to a broker ID, which will only show that broker's configuration; it is identical to the cluster-wide one.


Now let's add a dynamic configuration so we can see it while describing with the --native flag. We will change the log.cleaner.threads configuration, which controls the number of background threads used for log compaction and is 1 by default.

kfk configs --alter --add-config log.cleaner.threads=2 --entity-type brokers --entity-name my-cluster -c my-cluster -n kafka

or

kfk clusters --alter --cluster my-cluster --config log.cleaner.threads=2 -n kafka

Describing it via the Strimzi custom resource will return the full list again:

kfk configs --describe --entity-type brokers --entity-name my-cluster -c my-cluster -n kafka
...
  Kafka:
    Config:
      log.cleaner.threads:                       2
      log.message.format.version:                2.6
      log.retention.hours:                       168
      offsets.topic.replication.factor:          3
      transaction.state.log.min.isr:             2
      transaction.state.log.replication.factor:  3
...

Describing it with the --native flag gives more detail about whether each configuration is dynamic:

kfk configs --describe --entity-type brokers -c my-cluster -n kafka --native
Dynamic configs for broker 0 are:
  log.cleaner.threads=2 sensitive=false synonyms={DYNAMIC_BROKER_CONFIG:log.cleaner.threads=2, DEFAULT_CONFIG:log.cleaner.threads=1}
Dynamic configs for broker 1 are:
  log.cleaner.threads=2 sensitive=false synonyms={DYNAMIC_BROKER_CONFIG:log.cleaner.threads=2, DEFAULT_CONFIG:log.cleaner.threads=1}
Dynamic configs for broker 2 are:
  log.cleaner.threads=2 sensitive=false synonyms={DYNAMIC_BROKER_CONFIG:log.cleaner.threads=2, DEFAULT_CONFIG:log.cleaner.threads=1}
Default configs for brokers in the cluster are:

All user provided configs for brokers in the cluster are:
log.cleaner.threads=2
log.message.format.version=2.6
log.retention.hours=168
offsets.topic.replication.factor=3
transaction.state.log.min.isr=2
transaction.state.log.replication.factor=3

Deleting the configurations is exactly the same as for topics and users:

kfk configs --alter --delete-config 'log.retention.hours,log.cleaner.threads' --entity-type brokers --entity-name my-cluster -c my-cluster -n kafka

or use the following command:

kfk clusters --alter --cluster my-cluster --delete-config log.retention.hours --delete-config log.cleaner.threads -n kafka

After the deletion, you can see only the initial configurations:

kfk configs --describe --entity-type brokers -c my-cluster -n kafka --native
Dynamic configs for broker 0 are:
Dynamic configs for broker 1 are:
Dynamic configs for broker 2 are:
Default configs for brokers in the cluster are:

All user provided configs for brokers in the cluster are:
log.message.format.version=2.6
offsets.topic.replication.factor=3
transaction.state.log.min.isr=2
transaction.state.log.replication.factor=3

So that's all!

We are able to create, update, and delete the configurations of topics, users, and the Kafka cluster itself, and to describe the changed configurations both Kubernetes-natively and Kafka-natively using Strimzi Kafka CLI.

If you are interested in more, you can have a look at the short video in which I demonstrate Apache Kafka configuration on Strimzi using Strimzi Kafka CLI:
