TiRex - Time Series with State Tracking for Robotics (embedded)
Show notes
We have a new podcast partner: our thanks go to Hannover Messe
Questions or ideas about robotics in industry? helmut@robotikpodcast.de or robert@robotikpodcast.de
Transcript
00:00:00: Robotik in der Industrie, the podcast with Helmut Schmidt and Robert Weber.
00:00:09: Hello, dear listeners, welcome to a new episode of our podcast "Robotik in der Industrie".
00:00:17: My name is Robert Weber, and today there is a special episode, so a quick note first.
00:00:23: I am a bit of a Bavarian, as you all know, but today's episode is with Professor Dr. Sepp Hochreiter, who presented his new time series foundation model called TiRex last week, immediately climbed all the international leaderboards with it, and is now number one there, ahead of the big Alibabas, Amazons, Googles, etc.
00:00:47: And this new time series model could also be very interesting for robotics, because it can do so-called state tracking.
00:00:57: What all of that means, how he does it, and which use cases he sees for it in robotics, you will hear now in the podcast with my dear colleague Peter, who recorded the episode.
00:01:09: So, have fun, and see you next time.
00:01:11: Hi there, my name is Peter Sieberg and I am your host.
00:01:16: Today I am going to be talking to the one and only Sepp Hochreiter, and Sepp and I are going to be talking about TiRex,
00:01:24: and TiRex is the first xLSTM-based time series foundation model.
00:01:31: Hi Sepp.
00:01:32: Hi.
00:01:33: How are you doing?
00:01:34: I am fine, and I am very excited, because we are launching TiRex. I am very excited.
00:01:40: Oh, that's great.
00:01:41: We are going to be talking about TiRex in just a minute.
00:01:44: We have had you on our show actually two, three times.
00:01:50: And for that and other reasons, I believe that at least 95% of our listeners will have heard about
00:01:56: you, but still, maybe very quickly introduce yourself to our listeners.
00:02:01: Yes, my name is Sepp Hochreiter.
00:02:05: I am heading the Institute for Machine Learning here in Linz.
00:02:09: It's at JKU, the Johannes Kepler University.
00:02:12: I am also Chief Scientist of a newly founded AI company called NXAI, and this company is
00:02:23: dedicated to bringing AI to industrial applications, to bringing AI to machinery, and its focus
00:02:32: at the moment is xLSTM, the new technique.
00:02:37: And I am known for inventing LSTM. LSTM stands for Long Short-Term Memory, and LSTM started
00:02:44: all this chatbot, ChatGPT stuff, because the first large language model was an LSTM model,
00:02:50: and I am known for LSTM.
00:02:52: Great.
00:02:53: Thank you very much.
00:02:54: Last time we met was actually in Linz,
00:02:56: which you referred to as both where your new company NXAI, which you are a co-founder of, is based, as well
00:03:04: as where the Johannes Kepler University is.
00:03:07: Yeah, you already referred to LSTM.
00:03:11: I dare to use the quote "great thinkers stand on the shoulders of giants", even if it
00:03:19: was their own shoulders.
00:03:21: So why don't you maybe quickly take us by the hand and look back at your, I don't know,
00:03:28: maybe 30, 35 years of AI research.
00:03:33: Maybe you want to tell us what the main milestones were that then brought you to xLSTM.
00:03:38: Yes, I invented LSTM in 1991 in my diploma thesis.
00:03:45: I first analyzed the vanishing gradient, which is a common problem in deep learning, which
00:03:52: you have to overcome to build large models.
00:03:55: And I proposed the LSTM architecture for recurrent neural networks, for neural networks
00:04:00: which can process time series, which can process text.
00:04:05: But then neural networks were not popular anymore.
00:04:08: Support vector machines came into the community; we even had problems publishing LSTM.
00:04:16: And then, starting in 2006, deep learning came, and starting in 2010,
00:04:22: LSTM became very popular; all text and speech programs on cellphones were LSTM-based.
00:04:33: There were many, many LSTM applications, the same with Amazon and, you name it, Microsoft and
00:04:40: so on.
00:04:41: But then it turned out, in 2017, that there is another technique, called the transformer, where
00:04:48: the attention mechanism is built in, and these architectures are better at parallelizing.
00:04:56: You can push more data through these models in training than you could with LSTM.
00:05:05: The first large language models were based on LSTM, but this
00:05:11: parallelization, getting through more training data at the same time, pushed LSTM from the market,
00:05:17: and the transformer was used.
00:05:20: And I always thought, hmm, can we not scale up LSTM like transformers?
00:05:27: Can we not do the same?
00:05:29: Can we not build large models?
00:05:30: Can we not make it faster?
00:05:33: And with xLSTM, we achieved this.
00:05:35: We looked into it, we copied some of the tricks of the transformer technology, added some
00:05:43: of our own tricks from the LSTM technique, and then published this xLSTM technology, which
00:05:52: is a model based on the original LSTM, but which can be parallelized and has some other tricks
00:06:03: which make it really, really powerful.
00:06:05: And we showed it can achieve the performance of transformers in large language modeling.
00:06:12: We will soon show that we are at the same level as transformers.
00:06:16: Right.
00:06:17: We may be going into a little bit more detail later on in this comparison of the transformer
00:06:23: and xLSTM technology.
00:06:25: But our topic today is time series.
00:06:27: Now, am I correct in assuming that until recently, with regard to xLSTM as you just introduced it,
00:06:36: you had been concentrating on language?
00:06:39: So, how is time series data different from non-time series data?
00:06:45: That is, data that does not have any timestamp, from the perspective of the researcher Sepp Hochreiter.
00:06:52: Yes, yes.
00:06:53: Yes, first of all, for me, there's not a big difference.
00:06:57: If you give me a sequence, I can use every time series method, because I can assign
00:07:04: time points to the sequence elements, and I can analyze sequences like DNA or even text.
00:07:11: It might also be a sequence from a system.
00:07:13: There's not a big difference, but the data is different.
00:07:18: If you look at text, there are correlations between words which are far away, and these
00:07:25: are more abstract symbols that you process.
00:07:30: In time series, in most cases, you have numerical values, you have numbers or vectors, and you
00:07:38: process numerical values.
00:07:40: And often in time series, the data comes out of a complex system.
00:07:45: The system has something like a hidden state.
00:07:48: It's about what state the system is in, and then you want to predict the future, or you
00:07:55: want to classify what's happening right now.
00:07:58: This is the difference between abstract symbols, which have some meaning, and numerical values,
00:08:06: which came out of a complex system with hidden states.
00:08:09: Right, so referring to these systems, maybe you can give us a couple of examples.
00:08:15: Time series are being used in a variety of very different markets.
00:08:20: Maybe you can give us a couple of examples of use cases and markets where typical
00:08:25: time series data comes from.
00:08:29: Time series are pervasive, so you find them everywhere and you encounter them everywhere.
00:08:36: If you think about weather forecasting, or if you drive your car and the navigator
00:08:43: tells you the estimated time of arrival, that's a time series, that's forecasting.
00:08:48: If your system tells you when the battery is empty, if you have an e-car, that's a time
00:08:54: series problem. But it's also in stock market prediction, in predictive maintenance, in
00:09:01: logistics, where you have to predict:
00:09:03: when do you have to order new parts to keep your production going, and so on.
00:09:09: Or when your machinery needs new oil. You have to predict some markets:
00:09:16: for example, if you produce something for the car industry, you have to predict how many
00:09:22: cars will be sold, to adjust your production.
00:09:27: Very prominent was Amazon.
00:09:31: They have time series prediction all across the company, because they have to predict two things:
00:09:39: first, how much of a product is bought, and also how long it takes to deliver it.
00:09:45: Because they do the delivery themselves, they are better at predicting how well a product
00:09:50: sells than the producers themselves.
00:09:54: Amazon is one big prediction company, and the whole business model is built on prediction.
00:10:00: But you need it for climate, you need it for medicine, there are EEG and ECG, there are so
00:10:08: many predictions.
00:10:10: You want to know how the patient is responding to treatments, or during a surgery.
00:10:18: There are applications in agriculture: if you grow corn or apples or whatever, you have to predict
00:10:27: the weather, you have to predict the soil condition.
00:10:31: A very famous application, where we are very good, is hydrology, to predict floodings.
00:10:40: Because here, if it's raining, we have these hidden states: the rain goes into the soil,
00:10:47: goes into underground basins, and you have to memorize how full these
00:10:54: basins are, because if they are full, the rain will go directly into the river.
00:10:59: Otherwise, the underground basins will be filled up before the water goes into the rivers.
00:11:04: This is a very, very prominent example of how we do earth science in climate change,
00:11:11: where you need this forecasting all the time.
00:11:13: You need forecasting in energy, in smart grids: you have to predict the weather for solar
00:11:19: energy and wind energy, and you also have to predict the customer behavior.
00:11:24: If there's something like a football game, like Germany is in the final, everybody turns
00:11:31: on the TV and puts a beer into the fridge or whatever.
00:11:35: These are a couple of examples, but there are so many, many, many more.
00:11:39: It's everywhere, it's really everywhere.
00:11:41: Yes, really.
00:11:42: We hear you and I'm sure you could go on for a couple of minutes.
00:11:45: Yeah, so very good.
00:11:46: You gave a couple of examples from specific areas.
00:11:50: Our main interest here in our podcast is the industrial environment.
00:11:56: Since when, then, looking back over these, whatever, 35 years, since when have you been looking specifically
00:12:03: at time series?
00:12:04: Was this from the moment that you came up with LSTM?
00:12:08: If that was the case, what were, until then, the main algorithmic capabilities? We will come
00:12:15: to the new ones later on, but what were the standards in the past that were capable of
00:12:20: looking into the future of time series?
00:12:23: Yes, I started in kindergarten.
00:12:26: I was always interested in predicting the future. But no kidding: the first LSTM
00:12:32: applications were time series, because text was not available.
00:12:36: We never thought about doing text with LSTM, and where I come from,
00:12:42: only time series were on our minds.
00:12:45: The original LSTM was designed for time series, and it performed very
00:12:52: well.
00:12:53: LSTM is used everywhere.
00:12:56: Even one guy from Google told me LSTM is still used in Google Translate, because it's faster
00:13:03: than the transformer architecture in inference, in applying it.
00:13:07: But LSTM was used in many, many industries, in many, many broad domains, in industry for prediction.
00:13:16: I gave a couple of applications, but there are many more.
00:13:21: LSTM was good there.
00:13:23: Alternatively, there were models like ARIMA, statistical models, which only do this local
00:13:30: averaging.
00:13:31: Meaning, you make an average over the last values, or you calculate a trend or something
00:13:39: like this.
00:13:40: This was typical for stock market predictions with traditional statistical methods.
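As a minimal sketch of what such local averaging looks like, assuming a plain moving-average baseline (the function name and numbers are made up for illustration):

```python
import numpy as np

def local_average_forecast(series, window=5, horizon=3):
    """Forecast by repeating the mean of the last `window` values --
    the kind of local averaging classical baselines build on."""
    level = float(np.mean(series[-window:]))
    return np.full(horizon, level)

print(local_average_forecast(np.array([3.0, 4.0, 5.0, 6.0, 5.0, 4.0])))
# -> [4.8 4.8 4.8]: fine for slow drifts, but blind to any hidden system state
```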
00:13:45: And LSTM was better, because LSTM could memorize stuff, and it could memorize what state some
00:13:53: system is in.
00:13:54: I already brought up the hydrology example here.
00:13:57: If it's snowing, the snow does not turn to water immediately.
00:14:00: So the snow is stored, the snow is lying on the soil.
00:14:04: And if the sun shines, the snow turns into water.
00:14:07: And this is something like storing water, also in the glaciers and underground basins.
00:14:12: Some systems also, like the sea: if there was a storm at sea, afterwards you don't see it.
00:14:18: But there's a hidden state, because in the sea, under the water, a lot of food is still in
00:14:24: the water because of the storm before,
00:14:27: and the fish are eating it. There are these hidden states everywhere.
00:14:31: And these statistical methods were not good at capturing the hidden states, because they
00:14:36: only do this local averaging.
00:14:37: LSTM was very good at capturing the hidden states of some systems.
00:14:42: Think about a pipe: you have a water pipe, you open something, water is flowing.
00:14:48: But on the other end, it takes time until the water arrives.
00:14:53: But you have to memorize: yes, I opened the water pipe.
00:14:56: I opened it, and the water is flowing.
00:14:58: These are hidden states.
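To make the pipe example concrete, here is a toy sketch of such a hidden state, with made-up numbers; it only illustrates memorizing the valve, not any actual LSTM machinery:

```python
# "Water flows at the outlet" depends on a valve opened several steps
# earlier, so a forecaster must carry that fact forward as hidden state.
DELAY = 3                                    # steps until the water arrives

open_for = None                              # hidden state: steps since opening
for t, action in enumerate(["open", None, None, None, "close", None]):
    if action == "open":
        open_for = 0
    elif action == "close":
        open_for = None
    elif open_for is not None:
        open_for += 1                        # time passes with the valve open
    flowing = open_for is not None and open_for >= DELAY
    print(t, flowing)                        # flows only at step 3, then closed
```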
00:15:01: Very good.
00:15:02: Now you have come out with a new time series foundation model called TiRex.
00:15:09: The king of time series, I assume that's what you want to convey with the name.
00:15:13: And it's based on xLSTM.
00:15:15: You just introduced xLSTM in the comparison with the transformer.
00:15:19: But what are the main features?
00:15:21: What is the USP of TiRex?
00:15:25: TiRex, indeed, is the king of time series.
00:15:29: It's the king of time series models.
00:15:31: First of all, it's based on xLSTM.
00:15:34: And I already told you, the original LSTM is very, very good at time series prediction.
00:15:40: Now we have improved it.
00:15:42: But it still kept its super performance in time series prediction.
00:15:48: It's very good.
00:15:49: But with all these tricks of the transformers, it became even more powerful.
00:15:56: And this is a time series foundation model.
00:15:59: What does this mean?
00:16:00: This is a new kind of time series prediction, which came out of these large language models
00:16:08: because of the in-context learning.
00:16:10: For large language models, you can write something in the context, give some questions
00:16:16: or give some examples, and then the large language model processes this and gives you an answer.
00:16:22: Here's the idea.
00:16:23: I train a very large model on many, many different time series.
00:16:28: And then I give a new time series in context.
00:16:31: It's like a prompt.
00:16:32: It's like a question.
00:16:33: But in this case, only numerical values.
00:16:37: It's a time series.
00:16:38: And then you say: can you give me the future?
00:16:41: Can you give me the next time point, or the next 10 time points?
00:16:44: Or can you tell me what's happening in 100 time points?
00:16:48: And the idea is that the large language models have so much knowledge,
00:16:56: and these time series foundation models have so much knowledge about time series,
00:17:04: that they don't have to learn your new time series; they already see the patterns.
00:17:09: They saw other time series, and if you give a prefix, the beginning of a time series,
00:17:15: for them it's clear: yes, the future will look like this.
00:17:18: Here we have the big advantages of these foundation models.
00:17:22: First of all, they allow non-experts to use high-quality time series models.
00:17:29: You have no idea about time series?
00:17:31: You put all your values in context, and you get a good prediction.
00:17:34: Wow, you don't have to know anything about time series or deep learning.
00:17:40: That's the first big advantage.
00:17:42: The second big advantage is: if you don't have enough data,
00:17:47: then you cannot learn a model for your particular domain or time series.
00:17:53: But for this foundation model, you only give the beginning of your time series.
00:17:57: And you don't have any data.
00:17:59: You don't have training data, but the model already makes good predictions.
00:18:03: It is therefore perfectly suited for tasks where not enough data is available.
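As an illustration of this zero-shot, in-context use, here is a minimal sketch following the interface published in the open TiRex repository (github.com/NX-AI/tirex); the exact names (`load_model`, `forecast`, `prediction_length`) and the return values are assumptions to verify against the current README:

```python
import torch
from tirex import load_model               # package name assumed from the repo

model = load_model("NX-AI/TiRex")           # pretrained checkpoint, no training

# The "prompt" is just raw numbers: a batch of context windows.
context = torch.rand(2, 256)                # 2 toy series, 256 past steps each
quantiles, mean = model.forecast(context, prediction_length=32)  # returns assumed
print(mean.shape)                           # point forecasts for the next 32 steps
```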
00:18:11: Okay, very good.
00:18:13: So, what about... this is about the quality; maybe the ease of use, we come to that in a moment.
00:18:19: At the very end, we're going to be looking at some benchmark numbers, and maybe do some comparisons as well.
00:18:26: But before then, if you compare: what about the size of the model?
00:18:31: What about the speed of the model in relation to other solutions in the market?
00:18:35: Okay, we'll go to the numbers later, but compared to other solutions.
00:18:39: I have to mention other solutions.
00:18:41: Almost all other competitors in this domain, meaning time series foundation models,
00:18:49: are based on the transformer technology, because it's so popular,
00:18:53: it's so successful in large language models, in ChatGPT, you know it.
00:18:58: And they have a problem, they have a problem, because they are typically very large
00:19:04: and they are typically very slow.
00:19:08: For example, if you give a time series in context, as I said,
00:19:12: then for every prediction they have to go over the whole time series again and again.
00:19:17: So, super slow.
00:19:19: What we achieved is two things.
00:19:22: First of all, our model is small.
00:19:25: Our model has, because it's based on xLSTM, a fixed-size memory.
00:19:31: Therefore, it's perfectly suited for embedded systems and edge devices,
00:19:36: which transformers cannot do.
00:19:38: And we are super fast.
00:19:41: We are super fast for two reasons. First, because we are small.
00:19:44: Okay, if we are small, we are faster, because we don't have to do so many computations.
00:19:49: But also because, in inference, transformers are quadratic in the context length,
00:19:56: in the length of the time series you give in context,
00:19:58: and the xLSTM is linear, because it only accesses its memory.
00:20:03: It's better.
00:20:04: Well, it's faster, it's much faster.
00:20:07: It's small and faster.
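A back-of-the-envelope sketch of the growth rates described here (the operation counts are illustrative; only the scaling matters):

```python
# Self-attention relates every time step to every other -> O(n^2) per pass;
# a recurrent cell does one fixed-size state update per step -> O(n).
for n in (1_000, 2_000, 4_000):             # context lengths
    print(f"n={n}: attention ~{n * n:>12,} pair ops, recurrent ~{n:>6,} updates")
# Doubling the context quadruples the attention cost but only doubles the
# recurrent cost -- and the recurrent state itself stays constant in size.
```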
00:20:08: And now, the most important thing is:
00:20:12: it's even better in prediction quality, in forecasting quality, because the xLSTM we use
00:20:20: is able to do state tracking. I told you there are states, like in hydrology: if you want to predict
00:20:28: how much water is in your river, there are these hidden states, water is in the snow, water is in
00:20:33: the soil, water is in the underground basins, and you have to keep track of this, you have to memorize
00:20:39: it; you have to track that it's raining, that the water is going into the soil but will flow out
00:20:44: later. These are states, these are hidden states of the system. Also in robotics, the state would be
00:20:52: where your robot arm is: you can memorize what movements you have done and where your robot arm
00:20:58: is located. And LSTM can do that, but transformers, or these fast models like RWKV or Mamba, these
00:21:11: models which came out, cannot do the state tracking, cannot keep track of or cannot monitor
00:21:18: which state your system is in, and that's so important. And therefore we are, on many time series,
00:21:25: so much better, because we can do state tracking, we can memorize what state a complex system is in.
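The robot arm case boils down to integrating a stream of movement commands into a pose, which is exactly a running state; a toy sketch with made-up numbers:

```python
# State tracking as integration: the current joint angle is not visible in
# any single command, it has to be accumulated step by step in a memory.
commands = [+5.0, -2.0, +1.5, +0.5, -4.0]       # made-up delta angles in degrees

angle = 0.0                                      # the tracked state
for delta in commands:
    angle += delta                               # one state update per time step
print(f"current joint angle: {angle:.1f} deg")   # -> 1.0 deg
```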
00:21:33: And to come to the competitors: our competitors are something like Chronos from Amazon,
00:21:39: TimesFM from Google, Moirai from Salesforce, Toto from Datadog, and also Alibaba,
00:21:53: the Chinese company, put some new foundation models for time series onto the Hugging Face
00:22:01: leaderboards only a couple of days ago. These are just the big companies, and they devoted big teams to getting good
00:22:08: models, and we are considerably better, we are clearly better than all these methods, because
00:22:16: we have an advantage, because we can do the state tracking. And it's not only a small difference,
00:22:21: it's a clear difference where we are better, and all these big companies could not keep up with us,
00:22:28: because it's a technology, it's our technology, it's NXAI technology, it's European technology,
00:22:35: which has beaten everything else. And we are not only better in forecasting; as already said,
00:22:43: we are faster and we are smaller, and this is fantastic, that's unbelievable: we are better,
00:22:50: faster, smaller. And we are so happy, we are so excited that we are clearly in front compared
00:22:58: to the teams of these big companies. That's great, we can really feel your excitement,
00:23:05: Sepp, that is really great: higher quality, more speed, smaller. What does that mean? You
00:23:11: already referred to edge as a potential; maybe give us a couple of typical use cases where you see
00:23:20: TiRex being applied. TiRex should become a standard: if you do some time series workloads,
00:23:28: then at some machinery, if you have a small device and you want to know what's happening on
00:23:35: your machine, if you do control stuff, you should use this. Because at the machinery,
00:23:41: you have to be fast, to do inference fast enough, and you have to be small, because you cannot
00:23:47: put a big computer beside your machinery. Small and fast is important, and being good
00:23:53: is also an advantage. Or in process control, like a digital twin: you have a simulation
00:23:59: and you do prognosis, you do forecasting of your system, like the heat: is it too hot at some point?
00:24:09: If it's too hot, if the forecasting says it will become too hot in your industrial process,
00:24:14: and you have this small device on the side with TiRex in it, TiRex says: hey, stop, it's becoming too
00:24:22: hot, regulate it down. Or TiRex tells you the catalyst is not well distributed, because
00:24:30: the forecasting can predict the distribution of the catalyst or some chemical material in your
00:24:37: process, and says: hey, we have to change this, give more of it, or whatever. And this is important,
00:24:44: because this has to be in real time: if you want to steer the process, if you want to control the
00:24:51: process, it has to have real-time capabilities. It has to be small, because it has to fit into a
00:24:57: small device, an embedded device, in your production system. But you will also see TiRex in
00:25:04: autonomous driving, because in cars you have to predict when the battery will be empty, and there are
00:25:10: many prediction tasks. You will see it in drones, where you also have to predict. You will see it in all
00:25:17: autonomous systems, especially in autonomous production systems, because TiRex is good. TiRex,
00:25:26: I mean the quality of its predictions, it is small, it fits on small devices, and it's super fast.
00:25:32: Yes, that's ideal for industry; industry should jump on it. Exactly, and I'm so happy, and I'm
00:25:40: sure that many listeners are so happy hearing exactly this. It's almost as if, you know, we started
00:25:47: working together three, four years ago, and now you come with this
00:25:52: great solution, almost as if it was specifically made for our audience, so to say. Very good. So,
00:26:00: you already referred to what it will be telling you, but who is the "you"? I mean, you referred to the
00:26:07: state tracking, but also to the in-context learning specifically. So what does that mean?
00:26:13: Who is going to be the typical user? Is that changing? Is it more the data scientist type of
00:26:19: very knowledgeable person, or does it mean that you're going to have, typically, the domain
00:26:24: expert being capable of using solutions that are going to be based on TiRex? That's a good thing,
00:26:31: because you don't have to be an expert anymore: you download your TiRex, you feed your
00:26:38: numbers, your time series, into the context, and you get a prediction, and the prediction
00:26:43: is as good, and in most cases even better, than if you would build a model using expert knowledge
00:26:53: and time series research and then do a prediction. That's super good, because now time series prediction
00:27:00: is open for everybody. But even better, even better: assume you are a company, and you sell
00:27:08: a device to different customers. Every customer says: can you adjust the device to my needs,
00:27:15: can you adjust the device to my environment, or to my product, or whatever? And then you
00:27:21: need somebody who is fine-tuning the time series prediction model, or the forecasting model, for
00:27:27: each customer. If you use TiRex, for example, you put TiRex on the machinery, it goes to the
00:27:35: customer, the customer starts the machinery, TiRex will suck in the data from the machinery
00:27:45: and put it in context, and it's doing prediction. And if the customer has a new product, TiRex will
00:27:51: take in the data for the new product, or the new use of the machinery, and can do prediction.
00:27:59: If the machinery is worn out or changes its behavior, TiRex can take in the current data
00:28:06: and do prediction. And you sell something, and you don't have to care about it, because TiRex
00:28:14: can adapt to all changes, because it can automatically load time series into the context
00:28:23: and track the machinery, track the use of the machinery, and you don't have to do anything
00:28:28: anymore, as a company selling machinery with time series forecasting
00:28:34: built into the machines you're selling.
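Mechanically, this "suck in the data and keep predicting" behavior can be pictured as a rolling context window; a sketch under assumptions, where `forecast` stands in for whatever model call is used (e.g. a TiRex checkpoint) and the window size is made up:

```python
from collections import deque

CONTEXT_LEN = 512                        # made-up window size
context = deque(maxlen=CONTEXT_LEN)      # oldest values fall out automatically

def on_new_measurement(value, forecast, horizon=32):
    """Feed in each new sensor reading; predict once enough history exists."""
    context.append(value)                # new products or wear just flow in
    if len(context) == CONTEXT_LEN:
        return forecast(list(context), horizon)   # no retraining anywhere
```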
00:28:40: That's really great, it's a direction that I've been looking at and expecting,
00:28:47: almost, for quite some time: that domain experts are going to be
00:28:54: using their data, the data that they've been producing but were never capable of doing
00:29:01: something with themselves, always needing to go to other people, third parties or in-house.
00:29:09: Now, you gave a general example of a company selling devices. What is going to be the type
00:29:18: of TiRex customer? What kind of product or service are they going to build on top of TiRex,
00:29:23: or are they going to be using TiRex directly? And maybe you want to tell us, in relation to that, what type of license you're going to put TiRex onto the market with.
00:29:28: First of all, TiRex is a base model, which we will put on Hugging Face to show everybody
00:29:36: that we are better: better than the Amazon guys, Google guys, Salesforce guys, Datadog guys,
00:29:43: Alibaba guys, you name it, better than the American and the Chinese models; so we have to go out.
00:29:48: But what we can do then is fine-tune. The base model can do every time series,
00:29:56: but if you have enough data in one domain, you can tweak it a little bit, and you always get better
00:30:02: in this specific domain if you adjust it, and there are tricks for how to do fine-tuning,
00:30:09: how to adjust it to a specific application, so you get better. So, the base model is already better
00:30:17: than the specific models used by the statistics guys, than what is used right now, but you can get even better
00:30:26: if you do fine-tuning, fine adjustment, if you go into your domain. And these would be customers:
00:30:34: we may say, we have the base model, but we can adapt it to your use case, and you get even better
00:30:41: performance, perhaps you even get faster, you can adjust it to your hardware, to your chip,
00:30:48: to your embedded device. And here the customers will pay us, hopefully, so that we adapt this super
00:30:59: cool, super strong model to their hardware, their specific applications.
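A runnable stand-in sketch of such domain fine-tuning: the tiny GRU forecaster below is not TiRex (whose weights and interface live in the NX-AI release); it only demonstrates the generic loop of adapting a pretrained sequence model to a customer's data:

```python
import torch
import torch.nn as nn

class TinyForecaster(nn.Module):
    """Stand-in for a pretrained forecaster; imagine loaded weights here."""
    def __init__(self, hidden=32):
        super().__init__()
        self.rnn = nn.GRU(1, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, context):              # context: (batch, steps, 1)
        _, h = self.rnn(context)
        return self.head(h[-1])              # one-step-ahead forecast

model = TinyForecaster()
# Small learning rate: adapt to the domain without overwriting what is known.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Synthetic "customer domain" data: a noisy sine wave, predict the next point.
t = torch.linspace(0, 12.56, 200)
series = torch.sin(t) + 0.1 * torch.randn(200)
for _ in range(100):
    i = torch.randint(0, 150, (1,)).item()
    window = series[i:i + 32].reshape(1, 32, 1)
    target = series[i + 32].reshape(1, 1)
    loss = nn.functional.mse_loss(model(window), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```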
00:31:07: Very good. Talking about the specific data, I understand: so there's going to be, I don't know, a
00:31:13: hydraulics model, there's going to be a whatever type of machine or robotics model, etc. Now, the model
00:31:21: that you come with, which is already very powerful: was that based on available public data, or maybe
00:31:29: also on data from companies that you have been working with in specific industrial segments?
00:31:35: Right now it's only based on public data. That's important, because otherwise we would have
00:31:42: license problems; it's based on public data. And here a nice thing is: a couple of days ago,
00:31:48: a new model came out, it's called Toto, from Datadog, a big American company, and they had one
00:31:58: trillion data points of internal data, in addition to the public data we are using, and we're still better. That's
00:32:07: almost like a joke, because they used internal data to build their model, in addition to the data we
00:32:14: have available. We're beating them; but imagine what model we would build if we also had access
00:32:22: to all the data the companies have internally. It would be unbelievable.
00:32:30: And here we hope that we get more data from our industrial partners, to build, on top of this
00:32:39: TiRex model, even better, more specific models, like for multivariate data; we already have ideas for how
00:32:47: to make it multivariate, stuff like this. But to build this, we need good data, and we are
00:32:55: right now collecting data, we are right now asking different partners: can we collaborate to build even
00:33:03: stronger time series models? We are so strong already, but we are looking into the future.
00:33:08: You can become even better, and I'm sure there are going to be hundreds, if not thousands, of listeners
00:33:14: from companies that are going to be very, very much interested in using their data, one way or the other,
00:33:20: in combination with your TiRex. Okay, let's look a little bit at the numbers. You referred to
00:33:27: two or three competitors, let's say, in the market already. Maybe you want to share with us
00:33:34: what the number one, or maybe the number two and three, time series benchmarks are, and, regarding the two or
00:33:41: three potential competitors, maybe you want to tell us how TiRex is performing relative to them.
00:33:50: Yes, it's a little bit complicated, because there are several evaluation measures, and if you're not
00:33:56: familiar with them, they are only numbers for you. Let's say we go back to the status we were seeing;
00:34:04: by now there are new submissions. There's one measurement method called CRPS, it's about
00:34:12: probabilistic forecasting, where you not only give one point, but something like an interval,
00:34:19: so you know how good it is. And with these numbers, the smaller the numbers, the better:
00:34:26: Chronos had 0.48, Chronos is from Amazon; TimesFM from Google has 0.46; TabPFN, that's a
00:34:39: method from Frank Hutter's group in Freiburg, has 0.48; all these methods are foundation methods. There's also
00:34:48: Moirai from Salesforce, Salesforce invests a lot into time series; it was about 0.5. And you see,
00:34:59: they are all lined up at 0.46, 0.47, 0.46, 0.47, and we get, on the same measurement, 0.41.
00:35:08: There's a big gap: all the big companies are competing at the level of 0.46, 0.47,
00:35:17: and with our first submission, we got 0.41; you see, there's a gap. Another method, another criterion,
00:35:26: would be the rank: you don't do the evaluation on one time series, you go over many, many time
00:35:33: series, and then you want to see how good you are, what rank you are on, what the average rank is.
00:35:40: Perhaps you're second on one, then you're third, then you're first, and you take the average rank.
00:35:46: And if you do this average rank, on what place you are, we got for TiRex, on average, 3,
00:35:55: over many, many, many methods, also specialized methods, and the next best method, like TimesFM,
00:36:03: is 6 on average, is on place 6 on average; Chronos is on place 7 on average; Moirai is also on place
00:36:11: 7 on average. We are on 3, and the next ones are on 6; you see, there's a big gap. Whether you
00:36:19: measure the performance directly, the prediction performance, or you rank the methods by what
00:36:25: place you are on, whether you are first or second, and then average over the places, we are, also with
00:36:31: a big gap, better than all the others. It's so fantastic, we couldn't believe it that we are performing so,
00:36:38: so well.
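For intuition on those CRPS numbers: CRPS scores a whole predictive distribution against the observed value, and in practice it is often approximated as twice the pinball (quantile) loss averaged over a grid of quantile levels; a toy sketch with made-up values:

```python
import numpy as np

def pinball_loss(y_true, y_quantile, tau):
    """Pinball (quantile) loss at level tau; lower is better."""
    diff = y_true - y_quantile
    return np.mean(np.maximum(tau * diff, (tau - 1.0) * diff))

taus = np.linspace(0.1, 0.9, 9)            # grid of quantile levels
y_true = np.array([10.0, 12.0, 11.0])      # made-up observed values
# Toy predictive distribution: quantile forecasts spread around the truth.
q_pred = {tau: y_true + (tau - 0.5) * 2.0 for tau in taus}

# CRPS is commonly approximated as 2 x the average pinball loss over taus.
crps = 2.0 * np.mean([pinball_loss(y_true, q_pred[tau], tau) for tau in taus])
print(f"approximate CRPS: {crps:.3f}")     # smaller = sharper, better calibrated
```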
00:36:45: And the reason for this is the technologies that you referred to: state tracking and in-context learning
00:36:52: on top of xLSTM, while all the other ones are transformer-based, or is that not necessarily so? All the others are now transformer-based,
00:36:58: because transformers are so popular. But in industry, in practice, LSTM performed very well;
00:37:05: LSTM was always strong in time series, often more so than transformer-based methods. But this is now
00:37:12: in-context learning, where you don't train, which is known from large language models,
00:37:18: and therefore everybody jumped onto transformers, because we know transformers can do this.
00:37:23: It was not clear whether LSTM or xLSTM can do this, and xLSTM can do it. For me, it was clear,
00:37:31: because we do this in language too, but here it went through the roof with this performance.
00:37:37: Very good, congratulations, it sounds really, really impressive. Before we close
00:37:44: off, why don't you share with us where you are based, where your team is, both for NXAI as well as
00:37:52: your job at the Johannes Kepler University. Maybe you're looking for new colleagues,
00:37:58: maybe there are jobs open, and if so, what should interested people bring?
00:38:02: Yes, indeed, we have jobs open. We are located in Linz: both the company NXAI is in Linz,
00:38:11: and my institute at the university is in Linz. We are always looking for very motivated,
00:38:19: interested researchers, but also developers. It's such an exciting field, believe me; if you join
00:38:27: us, you will have fun, it's just great to do, and there are also many success stories. What we also offer is
00:38:35: a dual system, so that you can also work from home half of the time or something like this;
00:38:41: this can be negotiated. And we have a very inspiring environment, many researchers, many
00:38:49: new ideas, everything is on fire. That's amazing, maybe I'll consider applying for a job with you;
00:38:57: no, I will not, but you, dear listener, I'm sure there are going to be many, many people.
00:39:04: And I think the most important thing is, you are maybe too modest, but we can all
00:39:11: feel your excitement, and we heard again today about the great technology coming from you,
00:39:19: coming from Linz, and also coming from Europe. So I can only support you and suggest that any
00:39:27: interested person listening should contact you. So, Sepp, thank you very, very much again.
00:39:35: As I suggested before, it feels almost like you are now so close to our industry, to our industrial
00:39:43: environment here. We are very, very much looking forward to seeing solutions based on TiRex,
00:39:50: the time series foundation model that is better, smaller, and faster. Thank you very much, Sepp, and
00:39:57: I am looking forward to seeing you soon in the Alps again. Yes, it was a pleasure, and please check out TiRex,
00:40:05: it's rewarding. Thank you, Sepp, bye bye. Bye bye, ciao.
00:40:20: (gentle music)