5.1.5.Locking

Virtuoso offers a dynamic locking strategy that combines the high resolution of row-level locking with the performance of page locking for large transactions.

Isolation Levels

Virtuoso has a full range of isolation options, ranging from dirty read to serializable . The default isolation is repeatable read , which is adequate for most practical applications.

Isolation is set at the connection, i.e. transaction, level. Variously isolated transactions may coexist and each will behave consistently with its semantic.

Repeatable read and serializable transactions are susceptible at any time to termination by deadlock, SQL state 40001. Other transactions are susceptible to deadlock if they own locks as a result of insert, update or delete. Deadlocks are resolved in favor of the older of the contending transactions. A transaction's age is the count of reads performed + 2 * the count of rows inserted, deleted or updated.

Any transaction that has modified the database may be rolled back; all transactions maintain a rollback log. This is a memory-based data structure that contains the state of changed rows as they were before the transaction first affected them. This leads to potential transient memory consumption. All transactions that have changed the database also have a roll-forward log, used to recreate the effects of the transaction during roll-forward recovery.

Read Uncommitted

This corresponds to SQL_TXN_READ_UNCOMMITTED. A read is never prevented by locking, nor do read rows stay locked. The data being read may or may not be committed, hence there is no guarantee of transaction integrity.

Read Committed

Historical Read Committed

Starting with release 5.0, Virtuoso has a non-locking, versioned read committed transaction mode. This is similar to Oracle's default isolation.

If a locked row is read without FOR UPDATE being specified and another transaction owns the lock, the reading transaction will see the row in the state it had before being modified by the transaction owning the lock. There will be no wait. If a row has been inserted but the insert not committed, the row will not be seen by the read committed transaction. If a row has been updated or deleted, the row will be seen as it was before the uncommitted modifying transaction.

If a row is read in read committed mode with FOR UPDATE specified or as part of a searched update or delete statement, the read committed transaction will wait for a locked row and will set an exclusive lock on the row if the row matches the search criteria. This exclusive lock will be held until the read committed transaction terminates.

Hence, if FOR UPDATE is specified, a read committed transaction will have repeatable read semantics, otherwise it guarantees no repeatable read but does guarantee that uncommitted data are never seen.

To make this the default mode, set DefaultIsolation in the Parameters section of virtuoso.ini to 2.

Row-by-Row Autocommit

This transaction mode causes all DML statements to commit after every modified row. This is useful for single user situations where one does large batch updates on tables. For example, an update of every row of a multi gigabyte table would be likely to run out of rollback space before completing. In practice, one can end up in a thrashing situation where a large transaction is in progress, is interrupted by a checkpoint which must temporarily roll back the changed pages, then again resume the transaction etc., leading to serious server unavailability. Note that normally the ini parameter TransactionAfterImageLimit places a cap on transaction size, catching situations of this type before they lead to thrashing.

The row by row autocommit mode prevents this from happening by committing each updated, inserted or deleted row as soon as all the indices of the row are updated. This mode will still maintain basic row integrity, i.e. if the row's data is in one index, it will be in all indices.

This mode is good for any batch operations where concurrent updates are not expected or are not an issue. Examples include bulk loading of data, materialization of RDF inferred data etc.

This mode is enabled with the log_enable function. If the bit of 2's is set in the argument, row-by-row autocommit is enabled and the setting will persist until modified with log_enable or the calling connection is disconnected or the calling web request terminates. Thus, an argument of 2 enables row-by-row autocommit and disables logging. An argument of 3 enables row-by-row autocommit and enables logging. This will cause every updated row to be logged in the transaction log after it is updated, which is not very efficient.

Since transaction-by-transaction recovery is generally not an issue in batch updates, a value of 2 is usually better. If the server is killed during the batch operation, it may simply be restarted and the operation redone. Losing the first half through no logging will not be an issue since the operation will anyway have to be redone.

There is a slight penalty to row-by-row autocommit in comparison with making updates in larger batches but this is under 10%.

Repeatable Read

The transaction will wait for access to exclusively locked rows and will lock all rows it reads. The locking of read rows can be shared or exclusive depending on the FOR UPDATE clause in the SELECT or the SQL_CONCURRENCY statement option. In the case of a select over a range of rows where not all rows match selecting criteria, only matching rows are locked. This mode guarantees that any row locked by the reading transaction can be re-read on the basis of its identity (primary key) and will not have been changed by any other transaction while the locking transaction is in progress. This mode does not prevent another transaction from inserting new rows (phantoms) between rows locked by the original transaction.

Serializable

This mode guarantees that concurrent transactions will look as if the next transaction started only after the previous terminated. This is like repeatable read but prevents phantoms. Space found to be empty in one read will remain empty in the next read while the transaction is ongoing.

Serializable isolation is implemented by locking all ranges of rows matching criteria pertaining to the ordering index in a select. The range here includes the last row before the first in the range. An empty range can be locked by locking the row before the range by a special follow lock, which prevents insertions to the right of the locked row. A by-product of this is that serializable locking guarantees that a select count will give the same result repeatedly unless the transaction itself affect the rows counted.

Serializable isolation is slower than repeatable read and not required by most applications.

All insert, delete and update operations make an exclusive row lock on the rows they operate on, regardless of specified isolation.

Lock Extent

If a transaction is the exclusive owner of locks on a database page and a sufficient percentage of the rows are locked, it makes sense to replace distinct row locks with a single page lock. The LOCK_ESCALATION_PCT parameter controls the threshold for doing this. See the SET statement for details.

If a cursor reads data serially and has a history of locking a high percentage of rows on each page it traverses, it will start setting page level locks as its first choice. It will do this when entering a new page where there are no row-level locks.

Transaction Size

There is no limit in Virtuoso to the transaction size, though the underlying software or hardware may impose limits. Memory consumed by a transaction is proportional to its number of locks held and number of changed rows (insert, update, delete). BKLOBs manipulated by a transaction do not contribute to memory consumption, because they are always disk-based.