
Question Regarding Prompt #27

Open · rahulranjan29 opened this issue Oct 28, 2024 · 2 comments

@rahulranjan29
Hi!

Firstly, thanks for this useful paper!

I had a question regarding the prompt instruction text. An example prompt from your paper is shown below. I can see that you provide the historical data over 12 time steps as natural-language text directly in the prompt, and that you also encode the same historical data via the spatio-temporal encoder (STE). Why does the historical data need to be provided twice, once directly as text and once via the STE? Isn't the encoded representation from the STE enough for the model?

Secondly, if providing the historical data as natural language is important, doesn't that limit how much history you can append to the prompt? Right now it is just 12 time steps, but what if we want to provide historical data over 1,000 steps or more? In that case the LLM would not be able to handle such a large context, right? I assumed that another advantage of the STE is that it lets the LLM ingest a much longer historical context for prediction, because you could encode 1,000 time steps of data into a much more compressed representation.

Given the historical data for crime over 12 time steps in a specific region of New York City, the recorded number of burglaries is [1 0 1 2 0 0 1 0 0 3 1 1], and the recorded number of larcenies is [4 5 3 2 2 2 3 4 2 3 0 4]. The recording time of the historical data is 'October 20, 2020, 00:00, Tuesday to October 31, 2020, 00:00, Saturday, with data points recorded at 1-day intervals'. Here is the region information: No description is available for this region. Now we aim to predict whether the two specific crimes will occur in this region within the next 12 time steps during the time period of 'November 1, 2020, 00:00, Sunday to November 12, 2020, 00:00, Thursday, with data points recorded at 1-day intervals'. To improve prediction accuracy, a spatio-temporal model is utilized to encode the historical crime data as tokens, where the first and the second tokens correspond to the representations of burglaries and larcenies. Please conduct an analysis of the crime patterns in this region, considering the provided time and regional information, and then generate the prediction of crime occurrence probability.
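
To make sure I understand the mechanism, here is a rough sketch of what I think is happening under the hood; the placeholder token name and the embedding-swap step are my guesses for illustration, not your actual code:

```python
# Sketch (hypothetical names) of how the prompt carries both views of the
# same history: the raw values as literal text, plus placeholder tokens whose
# embeddings get replaced by the STE output before the LLM forward pass.

burglaries = [1, 0, 1, 2, 0, 0, 1, 0, 0, 3, 1, 1]
larcenies  = [4, 5, 3, 2, 2, 2, 3, 4, 2, 3, 0, 4]

ST_TOKEN = "<ST_HIS>"  # hypothetical special token; the real repo defines its own

prompt = (
    f"Given the historical data for crime over {len(burglaries)} time steps "
    f"in a specific region, the recorded number of burglaries is "
    f"{burglaries}, and the recorded number of larcenies is {larcenies}. "
    f"A spatio-temporal model is utilized to encode the historical crime "
    f"data as tokens {ST_TOKEN} {ST_TOKEN}, where the first and the second "
    f"tokens correspond to burglaries and larcenies."
)

# At model time, each ST_TOKEN position would be located in the tokenized
# input and its word embedding overwritten with the projected STE vector,
# so the LLM effectively sees the history twice: as text and as dense vectors.
print(prompt)
```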
@Kaleemullahqasim

#24

First, they trained the LLM, and they say they used zero-shot prompting; yet even at the prediction step (or I should say the testing step) they are still providing the historical data. The explanation they gave is the following:

"We train the LLM on one region, but when testing it is a totally different region, so in this regard it is zero-shot prompting."

But the odd thing is that they don't even need the STE if they still have to inject the test-time history into the prompt; as you mention, the data is provided twice, once via the STE and again in the prompt. Either the STE alone should suffice, or this should not be called zero-shot prompting when historical data is injected at the testing step.

@LZH-YS1998
Collaborator

Hello, thank you for your attention! Including the real numerical values in the text helps large language models better understand the actual spatio-temporal situation. I strongly agree with your point that STE (Spatial-Temporal Embedding) encoding can compress information, thereby enabling prediction over more time steps. If you want to implement long-term prediction tasks, you can choose to compress the actual data (e.g., into mean values or numerical ranges) or opt not to provide the real historical spatio-temporal values at all.
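
For example, here is a minimal sketch of that kind of compression; the window size and the chosen statistics are illustrative assumptions, not code from this repository:

```python
# Sketch: compress a long series (e.g., 1,000+ steps) into a short textual
# summary so it fits in the LLM context, per the suggestion above.
# The number of windows and the statistics reported are assumptions.

def summarize_history(values, n_windows=4):
    """Split the series into windows and report mean and range per window."""
    size = max(1, len(values) // n_windows)
    parts = []
    for i in range(0, len(values), size):
        window = values[i:i + size]
        lo, hi = min(window), max(window)
        mean = sum(window) / len(window)
        parts.append(
            f"steps {i}-{i + len(window) - 1}: mean {mean:.1f}, range [{lo}, {hi}]"
        )
    return "; ".join(parts)

history = [1, 0, 1, 2, 0, 0, 1, 0, 0, 3, 1, 1] * 100  # 1,200 steps of history
print(summarize_history(history))
# Produces a few dozen tokens instead of 1,200 raw numbers in the prompt.
```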
