Friday, March 30, 2012

More on mysterious data loss in mssql2k - a possible solution?

Hi All,

Further to my previous long-winded question about a situation in which
we appear to be mysteriously losing data from our mssql2k server.

We discovered an update statement, in the stored procedure we believe
is at fault, after which no error check was being performed.

Under certain conditions, this update is fired against the same record
in the same table as the immediately preceding update statement within
the transaction. We are now suspecting that under some circumstances,
these two updates get into a locking conflict that is eventually
forcing the transaction to be rolled back.

However, I'm still left with three questions.

1) Where an update in a transaction gets locked, and an error isn't
tested immediately afterwards (ie no 'IF @.@.Error<>0' test is made),
would the transaction proceed as normal?

2) Most critically, would statements in the stored procedure that
appear after the COMMIT TRAN statement also be executed, even if an
unresolved lock existed within the transaction?

3) Assuming that (2) does happen, would a SELECT made on another
connection with a 'WITH(NOLOCK)' locking hint be able to see the
changes made in the locked transaction even if the server is set to
READ COMMITTED, and the SELECT takes place some time after the COMMIT
TRAN is issued? More to the point, given (2), how long would the
locked transaction survive before being rolled back after the COMMIT
TRAN has been issued? Is it possible that the COMMIT TRAN takes place,
the transaction is flagged for potential rollback while a lock
resolution is attempted, the stored procedure exists as though
everything was fine, a subsequent SELECT (ie performed as one of the
next operations in the same application) using WITH(NOLOCK) 'sees' the
changes made by the transaction, reinforcing the impression that the
transaction succeeded, and then at some point thereafter the lock is
determined to be unresolvable and the transaction is rolled back,
making it seem as though the data disappeared, even though it had been
SELECTable via a different connection to the server?

Thanks, by the way, to Simon and Erland for your advice on my previous
questions about this problem.

Much warmth,

M WellsM Wells (planetthoughtful@.gmail.com) writes:
> We discovered an update statement, in the stored procedure we believe
> is at fault, after which no error check was being performed.

Short recap: when an error happens in a stored prcoedures, one three
things can happen depending on the error:

1) The statement is terminated, and @.@.error is set. The transaction is
not rolled back.
2) The execution of the stored procedure is terminated and @.@.error is
set. No rollback occurs.
3) The entire batch is aborted. Transactions is rolled back.

More details on http://www.sommarskog.se/error-handling-I.html.

> Under certain conditions, this update is fired against the same record
> in the same table as the immediately preceding update statement within
> the transaction. We are now suspecting that under some circumstances,
> these two updates get into a locking conflict that is eventually
> forcing the transaction to be rolled back.

No, unless you engage in wierd arrangments with linked servers that
results in loopback connections, two statements in the same stored
procedure cannot get in conflicts with each other. What happens if
you have:

BEGIN TRANSACTION

UPDATE tbl1 ...

UPDATE tbl2 ...

UPDATE tbl3 ...

COMMIT TRANSACTION

and the UPDATE on tbl2 fails with a statement-terminating error (for
instance PK violation, NOT NULL violation), and there is no error-handling,
is that the effect of the updates on tbl1 and tbl3 will be persisted
when the transaction is committed.

> 1) Where an update in a transaction gets locked, and an error isn't
> tested immediately afterwards (ie no 'IF @.@.Error<>0' test is made),
> would the transaction proceed as normal?

Yes. (But you can't really say that an "update gets locked".)

> 2) Most critically, would statements in the stored procedure that
> appear after the COMMIT TRAN statement also be executed, even if an
> unresolved lock existed within the transaction?

Yes. (Save again that there are no "unresolved locks". All locks are
released as the transaction commits.)

> 3) Assuming that (2) does happen, would a SELECT made on another
> connection with a 'WITH(NOLOCK)' locking hint be able to see the
> changes made in the locked transaction even if the server is set to
> READ COMMITTED, and the SELECT takes place some time after the COMMIT
> TRAN is issued?

If you read data WITH (NOLOCK), you may be reading dirty data. It doesn't
matter what settings you have elsewhere. Query hints wins over anything
else. If you confirmation page reads with NOLOCK, you absolutely must
change that. Else you are not confirming data.

> More to the point, given (2), how long would the
> locked transaction survive before being rolled back after the COMMIT
> TRAN has been issued?

Assuming that there are no nested transactions, therre is no transaction
and no locks around after COMMIT.

But if the COMMIT for some reason is not executed, for instance because
the query is cancelled, or the error terminates the stored procedure,
so that the transaction lives on, then the transaction will continue to
live until the process disconnects, at which points a fat rollback will
set in. (Actually with connection pooling it's takes a little while more
before it happens.)

> Is it possible that the COMMIT TRAN takes place,
> the transaction is flagged for potential rollback while a lock
> resolution is attempted, the stored procedure exists as though
> everything was fine, a subsequent SELECT (ie performed as one of the
> next operations in the same application) using WITH(NOLOCK) 'sees' the
> changes made by the transaction, reinforcing the impression that the
> transaction succeeded, and then at some point thereafter the lock is
> determined to be unresolvable and the transaction is rolled back,

Again, there is no lock to resolve, nor that can be deemd to be
unresolved. I'm sorry to nag this point.

Judging from your description, you have two problems: 1) You read
with NOLOCK. 2) You fail to commit your transaction in some situations.
The missing @.@.error check is probably not the main problem about the false
confirmations. More likely you fail to handle procedure-aborting errors or
command timeouts, since these are the cases where you most easily can
go wrong about this.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

No comments:

Post a Comment