I encountered the following error message from the Log Reader Agent job that is created as part of transactional replication setup on the publisher. As a result of this error, no transactions were propagated from the publisher to any of its subscribers.
Error Message
2008-02-12 13:06:57.765 Status: 4, code: 22043, text: 'The Log Reader Agent is scanning the transaction log for commands to be replicated. Approximately 24500000 log records have been scanned in pass # 1, 68847 of which were marked for replication, elapsed time 66018 (ms).'.
2008-02-12 13:06:57.843 Status: 0, code: 20011, text: 'The process could not execute 'sp_replcmds' on ServerName.'.
2008-02-12 13:06:57.843 Status: 0, code: 18805, text: 'The Log Reader Agent failed to construct a replicated command from log sequence number (LSN) {00065e22:0002e3d0:0006}. Back up the publication database and contact Customer Support Services.'.
2008-02-12 13:06:57.843 Status: 0, code: 22037, text: 'The process could not execute 'sp_replcmds' on 'ServerName'.'.
The replication agent job kept retrying at the specified intervals and kept failing with the same message.
Investigation
I could clearly see that there were transactions waiting to be delivered to subscribers from the following queries:
SELECT * FROM dbo.MSrepl_transactions -- 1162
SELECT * FROM dbo.MSrepl_commands -- 821922
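The row counts noted above can also be obtained directly rather than by scrolling through SELECT * output. This is only a minimal sketch: it assumes the distribution database uses the default name "distribution", and the per-database breakdown is an illustrative addition.

```sql
-- Assumption: the distribution database is named "distribution" (the default).
USE distribution;

-- Total rows currently held for delivery.
SELECT COUNT(*) AS transaction_rows FROM dbo.MSrepl_transactions;
SELECT COUNT(*) AS command_rows     FROM dbo.MSrepl_commands;

-- Same command count, broken down by publisher database.
SELECT publisher_database_id,
       COUNT(*) AS command_rows
FROM dbo.MSrepl_commands
GROUP BY publisher_database_id;
```

These tables keep rows until the distribution cleanup job removes them, so the counts are an upper bound on what is still undelivered, not an exact backlog.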
The following steps were taken to investigate the problem further. They confirmed that transactions were queued, waiting to be delivered to the distribution database.
-- Returns the commands for transactions marked for replication
EXEC sp_replcmds
-- Returns a result set of all the transactions in the publication database transaction log that are marked for replication but have not been marked as distributed.
EXEC sp_repltrans
-- Returns the commands for transactions marked for replication in readable format
EXEC sp_replshowcmds
Resolution
Taking a backup, as the message suggests, did not resolve the issue. None of the commands returned by sp_browsereplcmds for the LSN mentioned in the message had any syntactic problems either.
EXEC sp_browsereplcmds @xact_seqno_start = '0x00065e220002e3d00006'
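The value passed to @xact_seqno_start is just the LSN from the error message with the braces and colons removed and a 0x prefix added. A minimal sketch of that conversion is below. The variable names are hypothetical, and it assumes the batch is run in the distribution database, where sp_browsereplcmds lives.

```sql
-- Hypothetical variable names; the LSN is copied verbatim from the error text.
DECLARE @lsn_from_error nvarchar(30);
DECLARE @xact_seqno     nchar(22);

SET @lsn_from_error = N'{00065e22:0002e3d0:0006}';

-- Strip the braces and colons, then prefix with 0x.
SET @xact_seqno = N'0x'
    + REPLACE(REPLACE(REPLACE(@lsn_from_error, N'{', N''), N'}', N''), N':', N'');

SELECT @xact_seqno AS xact_seqno_start;  -- 0x00065e220002e3d00006

-- Restrict the output to that single sequence number.
EXEC sp_browsereplcmds
     @xact_seqno_start = @xact_seqno,
     @xact_seqno_end   = @xact_seqno;
```

Bounding the range with @xact_seqno_end keeps the result set small when the distribution database holds several hundred thousand commands.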
In a desperate attempt to resolve the problem, I decided to drop all subscriptions. To my surprise, the Log Reader Agent kept failing with the same error. I had assumed that with no subscriptions to the publications, the Log Reader Agent would have no reason to scan the publisher's transaction log, but I was obviously wrong. Adding a new Log Reader Agent with sp_addlogreader_agent after deleting the old one did not help, and restarting the server did not help either.
EXEC sp_addlogreader_agent
    @job_login = 'LoginName',
    @job_password = 'Password',
    @publisher_security_mode = 1;
When nothing else worked, I decided to try the following procedures, which are reserved for troubleshooting replication:
-- Updates the record that identifies the last distributed transaction of the server
EXEC sp_repldone @xactid = NULL, @xact_segno = NULL, @numtrans = 0, @time = 0, @reset = 1
-- Flushes the article cache
EXEC sp_replflush
Bingo!
The Log Reader Agent started successfully this time. I wish I had tried both commands before I decided to drop the subscriptions; it would have saved me the considerable time and effort spent re-creating them.
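Because sp_repldone with @reset = 1 marks all replicated transactions in the log as distributed, it may be worth confirming the state of the publication database afterwards. The sketch below is one way to do that. The database name PublicationDB is hypothetical, and the checks are a suggestion, not part of the steps I actually ran.

```sql
-- Assumption: PublicationDB stands in for the real publication database name.
USE PublicationDB;

-- Reports the oldest distributed and oldest non-distributed LSN
-- when replicated transactions are present in the log.
DBCC OPENTRAN;

-- Lists transactions marked for replication but not yet marked as distributed;
-- after the reset this would be expected to return no rows.
EXEC sp_repltrans;

-- Shows whether replication is still holding up log truncation.
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = DB_NAME();

-- Release the log reader connection taken by sp_repltrans
-- so that the Log Reader Agent can connect again.
EXEC sp_replflush;
```

Only one log reader connection to a database is allowed at a time, which is why the sketch ends with sp_replflush before the agent is restarted.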
Question
Even though I managed to resolve the error and have replication functioning again, I think there may have been a better solution. I would appreciate your feedback and any approach you would propose to resolve the problem.