Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query timeout option (SQL_ATTR_QUERY_TIMEOUT) #515

Open
Odraio opened this issue Oct 20, 2022 · 16 comments · May be fixed by #517
Open

Query timeout option (SQL_ATTR_QUERY_TIMEOUT) #515

Odraio opened this issue Oct 20, 2022 · 16 comments · May be fixed by #517
Labels
feature a feature request or enhancement
Milestone

Comments

@Odraio
Copy link

Odraio commented Oct 20, 2022

Issue Description and Expected Result

Previously the RODBCext package provided an option to set the query timeout. As of 2020 the package is deprecated and the better alternative is odbc (cran).

This odbc package doesn't seem to provide the described option. I was wondering if there is another way around to implement a query timeout while using odbc (perhaps something like the C code of the RODBCext package which sets the value of SQL_ATTR_QUERY_TIMEOUT).

(R does have a withTimeout function, but I want to have a reliable/ correct implementation)

The nanodbc.cpp file does have a timeout function which sets the required SQL_ATTR_QUERY_TIMEOUT attribute, but this doesn't seem to be available in the R code?

Database

Microsoft SQL Server

Reproducible Example

con <- dbConnect(
odbc::odbc(),
driver = "SQL Server",
server = "SERVERNAME",
trusted_connection = TRUE,
timeout = 5,
encoding = "Latin1")

samplequery <- dbGetQuery(con, "
SELECT * FROM [Something]
")

@aryoda
Copy link
Contributor

aryoda commented Oct 22, 2022

nanodbc has a quite good API for that (just use the timeout argument in execute(), prepare() and others...):

https://nanodbc.github.io/nanodbc/api.html

Shouldn't be to hard to implement...

The implementation in https://github.com/r-dbi/odbc/blob/main/src/odbc_result.cpp does not pass a timeout
value to nanodbc so the documented default query timeout applies:

timeout – The number in seconds before query timeout. Default 0 meaning no timeout.

@Odraio Currently there is no query timeout (= run forever?). What is the intended purpose of the timeout (stop long running queries?)

@Odraio
Copy link
Author

Odraio commented Oct 22, 2022

@aryoda Thank you for your quick reply!

It is indeed the purpose of having a query timeout to stop long running queries (or stored procedures). I did have a look at the source code of the odbc package and the nanodbc API to understand the current timeout implementation. In addition, I made my best effort to create a separate branch which contains the (concept) implementation of a query_timeout option.

Could you please have a look at the branch? I would really appreciate it: main...Odraio:odbc:feature/query_timeout

(I'm still unsure about passing the query_timeout from dbGetQuery to nanodbc. This may be incomplete
-> file: odbc_result.cpp - function: void odbc_result::execute())

If something is wrong or incomplete, could you please send me a message and/or modify the branch?
Thanks again for the support.

@aryoda
Copy link
Contributor

aryoda commented Oct 22, 2022

@Odraio Wow, I love this strong commitment to open source (not just asking but actively contributing :-)

Please give me some time (I have multiple open roadworks at the moment and want to avoid bigger context switches)

@detule detule added the feature a feature request or enhancement label Oct 23, 2022
@Odraio
Copy link
Author

Odraio commented Oct 24, 2022

@aryoda, @detule, @jimhester I've checked and tested the changes multiple times, but the next line of code doesn't seem to use the query_timeout of the statement. (Atleast long running queries are not stopped after the specified timeout). Probably a small detail is missing, but I can't pinpoint the issue/ missing part.

I was wondering if you'd like to take a look at the described line of code.

// part of void odbc_result::execute() -> doesn't use/ pass the query_timeout of the prepared statement?
r_ = std::make_shared<nanodbc::result>(s_->execute());

// function to fill s_
void odbc_result::prepare() {
  s_ = std::make_shared<nanodbc::statement>(*c_->connection(), sql_, query_timeout_);
}

@aryoda
Copy link
Contributor

aryoda commented Oct 24, 2022

From a first look I'd guess that for a prepared statement not only the "prepare" statement needs a query timeout (looks like it already has in the above code) but also the actual execution of the query with execute(). Does the execute() signature here allow you to pass the query timeout too? Eg.:

r_ = std::make_shared<nanodbc::result>(s_->execute(query_timeout_));

@Odraio
Copy link
Author

Odraio commented Oct 24, 2022

@aryoda I'm not sure which execute() function is called of nanodbc:

L3505 - result execute(connection& conn, const string_type& query, long batch_operations, long timeout)
L3517 - result execute(statement& stmt, long batch_operations)

(Is the timeout parameter missing on L3517?)

I've tried calling a specific method, but the query_timeout doesn't seem to have any effect:

r_ = std::make_shared<nanodbc::result>(nanodbc::execute(*c_->connection(), sql_, 1, query_timeout_));

@aryoda
Copy link
Contributor

aryoda commented Oct 24, 2022

I mean the line 51 in this code:

https://github.com/Odraio/odbc/blob/41b2a9a135e7666cdf2730b3bd408b2fc1b41650/src/odbc_result.cpp#L48-L61

Your links show nanodbc files which shouldn't be modified.

But I just saw that prepared statements do not have a query timeout argument in nanodbc's execute:

https://nanodbc.github.io/nanodbc/api.html#_CPPv4N7nanodbc7executeER9statementl

So perhaps setting the query timeout in prepare is enough:

https://nanodbc.github.io/nanodbc/api.html#_CPPv4N7nanodbc7prepareER9statementRK6stringl

Edit: You could also try to set the query timeout explicitly after preparing a statement by calling the timeout method of the prepared statement, see https://nanodbc.github.io/nanodbc/api.html#_CPPv4N7nanodbc9statement7timeoutEl

PS: Without cloning, building and debugging it is difficult to say for me what exactly is missing ;-)

@Odraio
Copy link
Author

Odraio commented Oct 24, 2022

@aryoda Thanks for pointing me in the right direction, I've managed to create a working solution! 😃
I'll later on improve the documentation and extend the tests of the odbc project.

@aryoda
Copy link
Contributor

aryoda commented Oct 25, 2022

I have done some testing with your published query_timeout feature branch and the query timeout does already work partially but I cannot fetch any data from the DB. I am using MS SQL Server on Window.

Can you fetch the result set of a select query without getting an error?

I also cannot see the query_timeout in the argument list (intellisense in RStudio).

I think inheritance from the DBI package may be a reason for that.

Here is my testing code together with relevant stack traces:

library(odbc)

con <- odbc::dbConnect(odbc::odbc(), .connection_string = "put your con string here")

odbc::dbSendQuery(con, "WaitFor Delay '00:00:45'")
# <OdbcResult>
# SQL  WaitFor Delay '00:00:45'
# ROWS Fetched: 0 [complete]
# Changed: 0

# !!! proves that query timeout works !!!
odbc::dbSendQuery(con, "WaitFor Delay '00:00:45'", query_timeout = 10)
# Error: nanodbc/nanodbc.cpp:1655: HYT00: [Microsoft][SQL Server Native Client 11.0]Query timeout expired

# does fail with and without query_timeout argument!
odbc:dbGetQuery(con, "select 'hello' as world")
# Error in odbc:dbSendQuery(con, "select 'hello' as world") :
#   NA/NaN argument

# does fail with and without query_timeout argument!
res <- odbc::dbGetQuery(con, "select * from test_table")
# Error in result_fetch(res@ptr, n) :
#   nanodbc/nanodbc.cpp:3069: 07009: [Microsoft][SQL Server Native Client 11.0]Invalid Descriptor Index
# 7. # stop(structure(list(message = "nanodbc/nanodbc.cpp:3069: 07009: [Microsoft][SQL Server Native Client 11.0]Invalid Descriptor Index ",
#                     call = result_fetch(res@ptr, n), cppstack = NULL), class = c("nanodbc::database_error",
#                                                                                  "C++Error", "error", "condition"))) at RcppExports.R#81
# 6. # result_fetch(res@ptr, n) at Result.R#59
# 5. # dbFetch(rs, n = n, ...)
# 4. # dbFetch(rs, n = n, ...) at Connection.R#358

odbc::dbDisconnect(con)

@Odraio
Copy link
Author

Odraio commented Oct 26, 2022

@aryoda

Thank you for reviewing/ testing the branch. I've executed the same tests, but all succeeded (using MS SQL Server).
I don't get it why it fails on the dbFetch within your tests, because this hasn't been changed.
The only difference in our test code is the encoding, "Latin1".

Test code:

library(odbc)

print('################# START TEST')

conn<- dbConnect(odbc(), .connection_string = connectionstring, encoding = "Latin1")

# Simple SELECT query without query_timeout
print('test_result0')
test_result0 <- odbc::dbSendQuery(conn, "SELECT 'hello' as world")
dbClearResult(test_result0)

test_result0 <- odbc::dbSendQuery(conn, "SELECT 'hello' as world", query_timeout = 10)
dbClearResult(test_result0)

# Simple SELECT query without query_timeout
print('test_result1')
test_result1 <- odbc::dbGetQuery(conn, "SELECT 'hello' as world")
print(test_result1)

# Simple SELECT query with query_timeout
print('test_result2')
test_result2 <- odbc::dbGetQuery(conn, "SELECT 'hello' as world", query_timeout = 10)
print(test_result2)

# SELECT * query without query_timeout
print('test_result3')
test_result3 <- odbc::dbGetQuery(conn, "SELECT * FROM dbo.TestTable")
print(test_result3)

# SELECT * query with query_timeout
print('test_result4')
test_result4 <- odbc::dbGetQuery(conn, "SELECT * FROM dbo.TestTable", query_timeout = 10)
print(test_result4)

odbc::dbDisconnect(conn)

print('################# END TEST')

Print results:

[1] "################# START TEST"
[1] "test_result0"
[1] "test_result1"
  world
1 hello
[1] "test_result2"
  world
1 hello
[1] "test_result3"
  TestColumn1 TestColumn2
1  TestA       TestB     
2  TestC       TestD     
[1] "test_result4"
  TestColumn1 TestColumn2
1  TestA       TestB     
2  TestC       TestD     
[1] "################# END TEST"

@aryoda
Copy link
Contributor

aryoda commented Oct 27, 2022

@Odraio Could be a build or infrastructure problem on my test system. I will check my infrastructure/setup and try it again with CRAN odbc, self-built master odbc and build feature branch odbc to find out if the problems are on my side. Please give me a week for that...

@Odraio
Copy link
Author

Odraio commented Nov 2, 2022

@aryoda I've created a new branch which contains the query_timeout option, updated documentation (README) and (SQLServer) tests: feature/query_timeout_option

@aryoda
Copy link
Contributor

aryoda commented Nov 3, 2022

@Odraio I have tested your newest feature branch on a DEV DB based on a real-life PRD system and our unit tests did succeed!

odbc version is 1.3.3.9000

I have also found the problem why my manual tests (reported four comments above) did not work:

I have queried tables that contain at least one nvarchar(max) column which was NOT at the end of the table (so the known work-around is to select all columns except this one or to select all nvarchar(max) columns as the last columns in the select clause).

This is a known bug in the odbc driver, for details about a similar problem see: https://www.nickvasile.com/2020/05/13/error-invalid-descriptor-index/

Now also all my manual tests succeed (= do apply the new timeout argument)!

Really great work!

PS: I did not check the DBI tests due to lack of time and since this will be done anyhow when you open a PR.
I personally consider the DBI tests more being a "compliance" test since no driver/R package combo will achieve 100 % test success (I guess) - I have unpublished work on that (since I do not want to start a public discussion which DBI package/DB combo is "the best"). A simplified (but old) compliance report can be found here: https://github.com/aryoda/R_DBI_compliance_reports

@Odraio Odraio linked a pull request Nov 4, 2022 that will close this issue
@Odraio
Copy link
Author

Odraio commented Nov 4, 2022

@aryoda There seems to be one issue while using the query_timeout and params.
In case params are provided the query_timeout will be ignored in some cases. I'm not sure what is missing...

-> Update 1: Perhaps something is missing in odbc_result.cpp -> odbc_result::bind_list -> nanodbc::execute
-> Update 2: query_timeout is indeed missing. I will add it tomorrow...

@Odraio
Copy link
Author

Odraio commented Nov 5, 2022

A new fix has been pushed which resolves the issue of a query using params and query_timeout. Additional tests have been added as well.

@hadley
Copy link
Member

hadley commented Apr 24, 2023

@Odraio did you want to turn this into a PR? ... Doh, I now see that you did: #517

@hadley hadley added this to the v1.4.0 milestone Apr 24, 2023
@hadley hadley modified the milestones: v1.4.0, v1.5.0 Dec 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature a feature request or enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants