Dataset Creation Issue with Arcadia Instant for KSQL

I did a quick test with Arcadia Instant 4.5 and the Confluent Platform Demo 5.1.0 (better known as cp-demo). Instead of showing only the streams and tables that have been defined, both the new dataset creator and the connection explorer show all of the topics on the Kafka cluster. This makes navigation difficult, especially since internal topics are not filtered out. Also, if the topic name doesn’t match the stream or table name, I will likely get an error, because Arcadia uses the topic name to fetch the metadata from KSQL.

The behavior I would expect is to either show only the available streams and tables, or to have a separate topic explorer that provides a form to create a new stream or table.

I tested with some other streams backed by topics with the same name and everything worked as expected. Seeing the visuals update in real time is very cool and provides a nice face to stream processing.


Thanks for the feedback and suggested improvements. If you have any snapshots of errors while exploring topics please share them (if possible) and we’ll pass them along to our engineering team.

Thank you,
Tadd Wood

@Patrick_Druley Is the new dataset creation modal not sufficient? It does have a hierarchy of Topics -> Tables/Streams to make it easy to find what you are looking for.

Also, the Dataset Source gives you the option to create your own stream.

Also, there is an “ALL” option in Connection Explorer which narrows the list down to all the streams/tables. Did you see that?

Here is the error message:



Failed to load dataset table column detail

b’{"@type":“statement_error”,“error_code”:40001,“message”:“Could not find STREAM/TABLE ‘PARSED’ in the Metastore”,“stackTrace”:[“”,“”,“”,“sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)”,“sun.reflect.NativeMethodAccessorImpl.invoke(”,“sun.reflect.DelegatingMethodAccessorImpl.invoke(”,“java.lang.reflect.Method.invoke(”,“org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(”,“org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$”,“org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(”,“org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(”,“org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(”,“org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(”,“org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(”,“org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(”,“org.glassfish.jersey.server.ServerRuntime$”,“org.glassfish.jersey.internal.Errors$”,“org.glassfish.jersey.internal.Errors$”,“org.glassfish.jersey.internal.Errors.process(”,“org.glassfish.jersey.internal.Errors.process(”,“org.glassfish.jersey.internal.Errors.process(”,“org.glassfish.jersey.process.internal.RequestScope.runInScope(”,“org.glassfish.jersey.server.ServerRuntime.process(”,“org.glassfish.jersey.server.ApplicationHandler.handle(”,“org.glassfish.jersey.servlet.WebComponent.serviceImpl(”,“org.glassfish.jersey.servlet.ServletContainer.serviceImpl(”,“org.glassfish.jersey.servlet.ServletContainer.doFilter(”,“org.glassfish.jersey.servlet.ServletContainer.doFilter(”,“org.glassfish.jersey.servlet.ServletContainer.doFilter(”,“org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(”,“org.eclipse.jetty.servlet.ServletHandler.doHandle(”,“org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(”,“org.eclipse.jetty.server.session.SessionHandler.doHandle(”,“org.eclipse.jetty
.server.handler.ScopedHandler.nextHandle(”,“org.eclipse.jetty.server.handler.ContextHandler.doHandle(”,“org.eclipse.jetty.server.handler.ScopedHandler.nextScope(”,“org.eclipse.jetty.servlet.ServletHandler.doScope(”,“org.eclipse.jetty.server.session.SessionHandler.doScope(”,“org.eclipse.jetty.server.handler.ScopedHandler.nextScope(”,“org.eclipse.jetty.server.handler.ContextHandler.doScope(”,“org.eclipse.jetty.server.handler.ScopedHandler.handle(”,“org.eclipse.jetty.server.handler.HandlerCollection.handle(”,“org.eclipse.jetty.server.handler.StatisticsHandler.handle(”,“org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(”,“org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(”,“org.eclipse.jetty.server.handler.HandlerWrapper.handle(”,“org.eclipse.jetty.server.Server.handle(”,“org.eclipse.jetty.server.HttpChannel.handle(”,“org.eclipse.jetty.server.HttpConnection.onFillable(”,“$ReadCallback.succeeded(”,“”,“$”,“org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(”,“org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(”,“org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(”,“”,“org.eclipse.jetty.util.thread.ReservedThreadExecutor$”,“org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(”,“org.eclipse.jetty.util.thread.QueuedThreadPool$”,“”],“statementText”:“describe parsed;”,“entities”:[]}’

@Patrick_Druley - in the new dataset dialog what did you choose for the Topic and Table/Stream drop down?

If those are correct, then I don’t see why the system wouldn’t be able to pick up the correct stream/table.

Here’s an example of creating a dataset from a table whose name is different from its topic:



Agreed, it does work when the topic name is different, as long as Arcadia Instant can parse the topic name correctly. In the case of the topic that returned the error, “wikipedia.parsed”, it cannot. Even if you get the topic name parsing fixed to handle dots (.), I think the bigger question is why it is necessary to expose topics in the first place. I could understand exposing the topics if you allowed topic inspection or the ability to create streams or tables from topics in Arcadia Data, but I don’t believe those capabilities currently exist. At a minimum, I would highly encourage adding the ability to filter out internal topics (which usually start with an underscore) by default, with a toggle switch in case the user really wants to see them (we currently do this in Control Center).
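
To make the filtering suggestion concrete, here is a minimal Python sketch. It is not how Arcadia or Control Center implements this; the function name `visible_topics`, the `show_internal` toggle, and the sample topic list are all made up for illustration, and "internal" is assumed to mean nothing more than "the name starts with an underscore".

```python
from typing import Iterable, List


def visible_topics(topics: Iterable[str], show_internal: bool = False) -> List[str]:
    """Return the topics to display, hiding internal ones unless asked.

    Assumption: a topic counts as internal if its name starts with "_".
    """
    if show_internal:
        return sorted(topics)
    return sorted(t for t in topics if not t.startswith("_"))


# Hypothetical topic list, for illustration only.
topics = ["wikipedia.parsed", "_schemas", "__consumer_offsets", "_confluent-metrics"]

print(visible_topics(topics))                      # default view, internal topics hidden
print(visible_topics(topics, show_internal=True))  # toggle switched on
```

The default keeps the list short for most users, while the toggle still lets someone reach the internal topics when they really need them.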

It’s also worth noting that KSQL has no way to group streams or tables other than by topic, which is why I assume you used the same interface that would normally be used for a database schema that groups tables and views together. My suggestion would be to show all of the streams and tables first by default, and then, when a user selects a stream or table, show the underlying topic and message format and perhaps some sample data. I could see a future where KSQL allows for a grouping object that would leverage this existing interface well, but currently it doesn’t make much sense to me.
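
A rough Python sketch of that streams-first idea follows. Everything in it is an assumption for illustration: the `SOURCES` records, their field names, and the helper functions are hypothetical and do not reflect Arcadia's internals or the exact shape of any KSQL response. The point is only that the name used in DESCRIBE comes from the registered stream or table, not from parsing the topic name.

```python
from typing import Dict, List

Source = Dict[str, str]

# Hypothetical metadata for registered streams and tables (illustrative only).
SOURCES: List[Source] = [
    {"name": "WIKIPEDIA", "type": "STREAM", "topic": "wikipedia.parsed", "format": "AVRO"},
    {"name": "PAGE_COUNTS", "type": "TABLE", "topic": "PAGE_COUNTS", "format": "JSON"},
]


def listing(sources: List[Source]) -> List[str]:
    """Streams and tables first, each annotated with its topic and format."""
    ordered = sorted(sources, key=lambda s: s["name"])
    return [f'{s["name"]} ({s["type"]}) <- topic {s["topic"]}, {s["format"]}' for s in ordered]


def sources_for_topic(sources: List[Source], topic: str) -> List[str]:
    """Look up registered names by exact topic match instead of parsing the topic."""
    return [s["name"] for s in sources if s["topic"] == topic]


def describe_statement(source_name: str) -> str:
    """Build the DESCRIBE statement from the registered name."""
    return f"DESCRIBE {source_name};"


for line in listing(SOURCES):
    print(line)

# A dotted topic resolves to its registered stream rather than to "PARSED".
for name in sources_for_topic(SOURCES, "wikipedia.parsed"):
    print(describe_statement(name))
```

With a lookup like this, a topic named “wikipedia.parsed” would lead to a DESCRIBE on the stream registered against it, rather than on a name derived from the text after the dot.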

@Patrick_Druley have you seen the “Direct Access” option?

That should allow you to create streams/tables from a Kafka topic as you indicated. See example below:

Now when I go to the topic I can see the new stream I created:


@shaun Yeah, I like that Direct Access option, as it provides flexibility for both custom queries and creating streams and tables as you demonstrated. However, if you look at it from a user perspective, the information I need to create a stream or table on an existing topic (value format and value fields) isn’t really available in Arcadia yet. I would need some way to do topic inspection, or to read from the Confluent Schema Registry, to understand the messages that are currently in a topic. I tested running a KSQL PRINT command in the Direct Access editor and it didn’t work. If you can get that to work, that would be a fast path to topic inspection.


@Patrick_Druley agreed, that is a feature we should add to make the workflow easier.

I am also facing a similar issue. How can I resolve this?

Error running query: Error: b’{"@type":“statement_error”,“error_code”:40001,“message”:"line 2:19: mismatched input ‘.’ expecting ‘;’

@raghuram what statement are you running?

I created two tables in KSQLDB, one with a push query and the other with a pull query.
Push query: create table table_name as select col1,col2,col3 from table_name1 where rowkey='col1' emit changes;
Pull query: create table table_name as select col1,col2 from table_name1 WINDOW TUMBLING (SIZE 5 SECONDS);

When I select either of these tables in Arcadia and click “Sample Data”, I get the error above. I checked with the Confluent team whether there is any issue with the KSQLDB version; they told me that I am using the latest version and that there is no issue with it.

@taddwood… Can you please help on this?

I don’t believe EMIT CHANGES queries are currently supported. The pull query should work, though it’s possible the errors are related to newer versions of KSQL that we’re not certified against. Currently we’re certified with KSQL 5.1.

@taddwood Thanks for the update. Is Arcadia being considered for certification against newer versions of KSQL? If so, can I know when?