
SBL-SMI-00140: Internal: The MT Server has been disabled

Applies to:

Siebel Workflow - Version: 7.7.2.10 [18385] to 8.0 [20405] - Release: V7 to V8
Information in this document applies to any platform.
** Checked for relevance on 27 February 2012**

Symptoms

The customer is experiencing recent, intermittent unavailability of the WfProcMgr component installed on one server. The customer noticed one occurrence on Sunday 02/15 at 10:30 AM and another on 02/16 at 3:20 AM. When it occurred, they were able to restart the server. However, the customer cannot always restart the Siebel Server, because the application does not respond in a timely manner.

Because WfProcMgr was unavailable, SISNAPI errors were found in the SRBroker logs. Also as a consequence of this component being unavailable, the number of records in S_ESCL_REC appears to be increasing.

The records in S_ESCL_REC seem to be associated with triggers created for the assignment of campaigns. No recent changes were made to workflow policies or configuration.



The Enterprise log confirmed that WfProcMgr became unavailable.

The Enterprise logs were matched to the related WfProcMgr logs, and this analysis resulted in 3 groups of Enterprise + WfProcMgr logs:

WfProcMgr_58975.3.log also contains the error SBL-SMI-00140: Internal: The MT Server has been disabled.
The customer can see from WfProcMgr_80330.log that the same instance '1-6E1YM0' still could not be resumed even after several hours: SBL-BPR-00124: Cannot resume process instance '1-6E1YM0'. Verify that it does exist and has a 'Waiting', 'Suspended' or 'In Error' status.

A3, B3: The Workflow Process Manager task created by the Siebel Server Scheduler at 2009-02-16 03:19:36 with task ID 80973, which exited at 2009-02-16 03:20:37.

Finally, here the customer was able to find the error SBL-SMI-00062: Internal: No more process (multithreaded server) slots available in WfProcMgr_80973.log.


The customer checked the memory limits and found that they are not being reached (usage peaks at about 20% below the limit), which indicates that the cause does not appear to be memory related.

Cause

There were 2 problems associated with this workflow component crash/hang:

1. The jobs executed by the workflow process "SMCC - List Import" were consuming too many threads. At some point the Max Tasks limit was reached and, as a consequence, no more threads could be created to continue processing these records. This would explain the error SBL-BPR-00124: Cannot resume process instance '1-6E1YM0'.

2. After this problem was solved, a crash was found. This crash was associated with a failure of the workflow process "SMCC Sync PEM Status - Contact Phone Status".
The error seems to happen when the method "SyncPEMStatus" updates a business component field that has a picklist with a value that is not available in that picklist. The picklist name is "Response Outcome PickList".


Solution


1. Stopping the looping instances from the workflow instance view helped to free up some threads and kept the Max Tasks limit from being reached. This means that the job that processes List Import records must be reviewed to ensure that it does not keep looping and allocating a high number of threads until the limit is reached.

2. The suggestion here is to ensure that the value used by the workflow process "SMCC Sync PEM Status - Contact Phone Status" (through the method "SyncPEMStatus") to update the field exists in the picklist associated with this field ("Response Outcome PickList").
 
 
 
 

Applies to:

Siebel System Software - Version: 7.8.2.3 SIA [19221] and later   [Release: V7 and later ]
z*OBSOLETE: Microsoft Windows Server 2003
Product Release: V7 (Enterprise)
Version: 7.8.2.3 [19221] Fin Svcs
Database: Oracle 9.2.0.6
Application Server OS: Microsoft Windows 2003 Server SP1
Database Server OS: IBM AIX 5L 5.2

This document was previously published as Siebel SR 38-3208039791.

Symptoms

SBL-SMI-00140: Our Production system "stopped" processing today at 12:07 AM. There were 8 active PIMSI Engine tasks processing, but they did not complete for 8 hours.

One PIMSI Engine log showed the following:
GenericLog    GenericError    1    0    2006-11-24 11:56:18    (smimtreq.cpp (457) err=2100140 sys=0) SBL-SMI-00140: Internal: The MT Server has been disabled.
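A log line like the one above can be split into fields to make searching many component logs easier. The following is a minimal sketch in Python, assuming the layout of the single line quoted here; the field names (`event`, `subevent`, `source`, and so on) and the function name `parse_log_line` are illustrative choices, not Siebel terminology.

```python
import re
from datetime import datetime

# Pattern modeled only on the log line quoted above; other Siebel log
# lines may use a different layout and will simply not match.
LOG_LINE = re.compile(
    r"^(?P<event>\S+)\s+(?P<subevent>\S+)\s+(?P<level>\d+)\s+\S+\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"\((?P<source>\S+) \((?P<line>\d+)\) err=(?P<err>\d+) sys=(?P<sys>\d+)\)\s+"
    r"(?P<code>SBL-[A-Z]+-\d+): (?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of fields for a matching line, or None otherwise."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["ts"] = datetime.strptime(record["ts"], "%Y-%m-%d %H:%M:%S")
    return record


sample = (
    "GenericLog    GenericError    1    0    2006-11-24 11:56:18    "
    "(smimtreq.cpp (457) err=2100140 sys=0) "
    "SBL-SMI-00140: Internal: The MT Server has been disabled."
)
record = parse_log_line(sample)
print(record["code"], record["ts"], record["err"])
```

Filtering parsed records on `code == "SBL-SMI-00140"` and sorting by `ts` gives the first time the error appeared in a given log.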

We stopped and started our PIMSI processes and after a Recovery, the SSSE appears to be working and completing all 640 sync enabled users.

I searched support web but did not find anything relating to the error. Could you please:

a) describe the error
b) what is the appropriate action to take
c) is there any document where we can find all of the many errors/warnings that come from SSSE and details of what to do about the same?


Cause

Documentation enhancement

Solution

Message 1

For the benefit of other users:

The customer was looking for an SSSE document that lists the errors and warnings that come from SSSE, with details of what to do about each.

Currently there is no such document on SupportWeb, and no such section in the Siebel Server Sync Guide.

Although SSSE is a fairly new product offering, documentation covering errors, warnings and troubleshooting steps would make it easier for end users such as administrators and developers to support SSSE in a Production environment.

Please note that a Change Request, CR#: 10515531: “SSSE documentation with errors/warnings and appropriate actions to take (troubleshooting)”, has been logged as a documentation enhancement for this purpose.

Thank you.
 
 

Applies to:

Siebel CTI - Version 7.7.2.6 [18372] and later
Information in this document applies to any platform.
After upgrading the Cisco CTI Driver, we started experiencing higher than normal task counts for the Siebel CSM processes. Since the upgrade on Friday, 5 of our 20 CSMs have had task counts approach the limit we are configured for, which is 25.
Today they have seen the issue on 6 different Comm Servers, and multiple times on one of them.

Issue Impact
If these hung PIDs are not killed and they reach their Max Tasks limit of 25, the Comm Session Manager will crash. It has happened on 6 different servers today. The customer is considering backing out the upgrade.

Snippet from scomm logs
5:13:34 PM 100289106_28719[03/26/2011 13:28:03]:INFO:Trust Client Hostname (RMDTV2594)
100289106_28719[03/26/2011 13:28:03]:INFO:Client login from IP(10.137.157.43), Host(RMDTV2594)
100289106_28719[03/26/2011 13:28:04]:INFO:UpdateAgentActiveInfo ok, CfgID:1-9OAW39J, TelesetID:1-EML18SW
100289106_28719[03/26/2011 13:32:40]:INFO:ResetAgentActiveInfo ok
100289106_28719[03/26/2011 13:32:48]:INFO:UpdateAgentActiveInfo ok, CfgID:1-9OAW39J, TelesetID:1-EML18SW
100289106_28719[03/26/2011 13:42:53]:FATAL:SRM request time out in 600 seconds. No response from Comm. Session Manager
100289106_28719[03/26/2011 13:42:53]:FATAL:Failed to submit SRM request(InvokeCommand) to server, or Comm. Session Manager failed on processing request, ccfErrCode(1300001, input-args={

CSM Logs
GenericLog GenericError 1 0 2011-03-26 13:45:07 (smimtreq.cpp (457) err=2100140 sys=0) SBL-SMI-00140: Internal: The MT Server has been disabled
GenericLog GenericError 1 0 2011-03-26 13:45:13 (smimtreq.cpp (457) err=2100140 sys=0) SBL-SMI-00140: Internal: The MT Server has been disabled


Symptoms



On : 7.7.2.6 [18372] version, CTI

When attempting to use CTI functionality, the following error occurs.

ERROR
-----------------------
100289106_28719[03/26/2011 13:42:53]:FATAL:SRM request time out in 600 seconds. No response from Comm. Session Manager
100289106_28719[03/26/2011 13:42:53]:FATAL:Failed to submit SRM request(InvokeCommand) to server, or Comm. Session Manager failed on processing request, ccfErrCode(1300001, input-args={


Changes

none

Cause

The combination of ICM version 7.5 (the latest version of the driver) with the older Siebel 7.7.x release.

Solution

The driver provider, Cisco, was engaged to look into the supportability of ICM version 7.5 (their latest version) on the older Siebel 7.7.x release, because the CSM hangs occur only with this particular combination. Cisco was going to perform a round of tests and provide the solution.

Note: Since ICM 6 was working fine, the customer was staying with that version for the time being.

Applies to:

Siebel CTI - Version: 7.7.2.6 [18372] to 8.1.1 [21112] - Release: V7 to V8
z*OBSOLETE: Microsoft Windows 2000
This document was previously published as Siebel SR 38-3198235381.

Symptoms

The customer's system is based on multiple Communication Session Manager (CSM) components running on the same server.
Agents assigned to a given CSM are being logged out, and when they try to log back in they are connected to a different CSM than the one defined in the CommSessionMgr parameter of the Communications Configuration.

Here is the log fragment verified by the customer:

GenericLog GenericError 1 0 2006-11-09 08:56:48 (smimtreq.cpp (457) err=2100140 sys=0) SBL-SMI-00140: Internal: The MT Server has been disabled

This started around 8am yesterday on one component and around 8am today on the other.

Agent logins to Siebel and to the CommSessionMgr that uses the I3 CTI Driver then started failing, because the object manager apparently started using the wrong Comm Session Manager when attempting to log in to the I3 CTI middleware.

Example: agent1 is configured to use CommSessionMgrOne. agent1 logged into Siebel, then tried to log into the CTI toolbar, and the login failed. The I3 logs indicate that the agent's login request was coming from the CommSessionMgr component, not the CommSessionMgrOne component. Since CommSessionMgr uses a different CTI server than CommSessionMgrOne, the logins failed.

Could you please explain what would all of a sudden cause the error above, and what to do to mitigate it?

Cause

The customer had an implementation on I3 with several Communication Session Manager components placed on the same server, which also hosted the Siebel Gateway Name Server.

Agents are organized in separate CSM components: CommSessionMgr and CommSessionMgrOne.

From the information sent, we analyzed the log fragment and found an entry for the issue.
Consider this fragment from the file:

DS020_SComm_TMARSHA2.txt.log

...
TMARSHA2_40769[12/01/2006 11:35:51:421]:FATAL:Failed to establish session to GateWay(siebelgateway), Enterprise(siebel), Server(cridds020), Component(CommSessionMgrOne), ccfErrCode(2100140)
TMARSHA2_40769[12/01/2006 11:35:51:437]:DEBUG:About to connect to Server Request Broker at GateWay(siebelgateway), Enterprise(siebel), Server(cridds020) with user(TMARSHA2), key(ea4|457059e7|0)
TMARSHA2_40769[12/01/2006 11:35:51:468]:INFO:Successfully established session to Backup Server at GateWay(siebelgateway), Enterprise(siebel), Server(cridds020), Component(CommSessionMgr)


Solution

By default, when the primary CSM (CommSessionMgrOne in this case) is absent, for example because the component is down or not available, the Siebel Gateway Name Server will try to connect the Object Manager session to the default CSM, which is CommSessionMgr.

This explains why the agent was connected to the other CSM component/instance.

 
 


Applies to:

Siebel
Information in this document applies to any platform.

Purpose

This document describes how to troubleshoot the behavior where the Communications Session Manager (CommSessionMgr component alias) component process hangs or freezes.

When this type of behavior occurs, the number of CommSessionMgr component tasks in running status starts to increase. You can verify this in Site Map > Administration – Server Management > Components: select Communications Session Manager and check the column Running Normal Tasks.
Each CommSessionMgr component task corresponds to a CTI Command or Event and should be processed in less than 1 second, so during normal operations you should see 0, 1 or 2 tasks running, even for very large deployments. If you see several tasks running and the count keeps increasing, a CommSessionMgr hang or freeze is identified.
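The rule of thumb above (0 to 2 running tasks is normal, several and rising suggests a hang) can be expressed as a small check over periodic samples of the Running Normal Tasks column. This is only a sketch in Python: the function name, the `window` of three samples, and the way the samples are collected are assumptions for illustration; only the "0, 1 or 2 tasks" baseline comes from the text.

```python
def looks_like_csm_hang(samples, normal_max=2, window=3):
    """Flag a possible hang from running-task counts sampled over time.

    samples    -- running task counts, oldest first (collected by any means)
    normal_max -- highest count considered normal (2, per the text above)
    window     -- number of most recent samples to inspect (assumed value)
    """
    recent = samples[-window:]
    if len(recent) < window:
        return False  # not enough data to judge
    above_normal = all(count > normal_max for count in recent)
    rising = all(a < b for a, b in zip(recent, recent[1:]))
    return above_normal and rising


print(looks_like_csm_hang([0, 1, 2, 1, 0]))   # normal operation
print(looks_like_csm_hang([1, 4, 9, 15]))     # counts above normal and rising
```

A count that is high but flat is not flagged by this sketch; whether that should also count as a hang is a judgment call the text does not settle.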

This behavior can also generate error messages in the Siebel log files similar to those below:

In the CommSessionMgr component log files, messages like:
  • “SBL-SMI-00140: Internal: The MT Server has been disabled”, when the CommSessionMgr component reaches the value of the MaxTasks component parameter.
In SComm_<user id>.log:
  • “FATAL:SRM request time out in 600 seconds. No response from Comm. Session Manager”, indicating that the Siebel Object Manager made a request to the CommSessionMgr component but it did not respond within 600 seconds (the default value of the CommReqTimeout component parameter).
For the end user, a Communications Session Manager hang has a high impact on the whole Siebel Enterprise operation, unlike a CTI Toolbar hang (described in Note 477899.1) or an Internet Explorer hang (described in Note 477510.1), which usually occur sporadically and affect only a few users at a time.
A Communications Session Manager hang affects several users at the same time, or the whole Enterprise, with high impact on the production environment. For end users, the usual symptoms are: the CTI Toolbar cannot be used (CTI Toolbar buttons disabled), the Siebel application hangs for several minutes when CTI users try to log in, the Siebel application freezes when the CTI Toolbar is used, and CPU usage is high on the Siebel Servers.
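Because the SComm timeout message states how long the Object Manager waited, the time at which the unanswered request was submitted can be estimated by subtracting that interval from the log timestamp. The Python sketch below assumes the SComm line layout seen in the fragments quoted in this post (session prefix, `[MM/DD/YYYY HH:MM:SS]`, level, message) and assumes the FATAL line is written at the moment the timer expires; the function name is illustrative.

```python
import re
from datetime import datetime, timedelta

# Layout inferred from the SComm fragments quoted in this post.
SCOMM_LINE = re.compile(
    r"^(?P<session>[^\[\s]+)"
    r"\[(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})[^\]]*\]"
    r":(?P<level>[A-Z]+):(?P<msg>.*)$"
)
TIMEOUT = re.compile(r"SRM request time out in (\d+) seconds")


def estimate_request_times(lines):
    """Return (session, estimated submit time) for each SRM timeout line."""
    results = []
    for line in lines:
        match = SCOMM_LINE.match(line.strip())
        if match is None or match.group("level") != "FATAL":
            continue
        timeout = TIMEOUT.search(match.group("msg"))
        if timeout is None:
            continue
        logged = datetime.strptime(match.group("ts"), "%m/%d/%Y %H:%M:%S")
        waited = timedelta(seconds=int(timeout.group(1)))
        results.append((match.group("session"), logged - waited))
    return results


lines = [
    "100289106_28719[03/26/2011 13:32:48]:INFO:UpdateAgentActiveInfo ok, "
    "CfgID:1-9OAW39J, TelesetID:1-EML18SW",
    "100289106_28719[03/26/2011 13:42:53]:FATAL:SRM request time out in "
    "600 seconds. No response from Comm. Session Manager",
]
for session, submitted in estimate_request_times(lines):
    print(session, submitted)
```

For the fragment quoted earlier in this post, the estimate falls a few seconds after the last successful INFO entry, which helps narrow down which CommSessionMgr log window to collect.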

Troubleshooting Steps

Cause


The CommSessionMgr component acts as a gateway between the Siebel Application Object Manager component and the CTI Driver. The Application Object Manager sends CTI commands/requests to the CommSessionMgr component, which forwards them to the CTI Driver to be processed. For CTI events, the CTI Driver receives the events from the Switch/CTI Middleware Server and sends them to the CommSessionMgr component, which forwards each event to the Application Object Manager to be processed. The Communications Session Manager component itself does no processing of CTI Commands or Events. In the majority of scenarios, the root cause of a CSM hang is that the CTI Driver stops responding, or that the Siebel Application Object Manager or Service Request Broker component stops responding to CommSessionMgr.

The CommSessionMgr component loads the CTI Driver library, which is usually provided by a third-party CTI Middleware vendor (Cisco, Avaya, I3, Syntellect, Nortel...). In the majority of CommSessionMgr hang scenarios, the CTI Driver stops responding, causing the CommSessionMgr tasks to run for more than 1 second.

One way to determine whether the root cause is the CTI Driver or an Application Object Manager/SRBroker component freeze is to check, on a Siebel Server with multiple instances of the CommSessionMgr component, whether ALL instances hang at the same time or just one instance at a time. If just one, or not all, of the CommSessionMgr component instances are affected, most likely the CTI Driver stopped responding. If ALL CommSessionMgr instances show the number of running tasks increasing at the same time, most likely an Application Object Manager or SRBroker hang is affecting the CommSessionMgr component. For the Application Object Manager hang scenario, you can also refer to Note 478050.1.
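The all-instances versus some-instances test described above can be written as a short decision function. This Python sketch is illustrative only: the function name, the instance labels, and the threshold of 2 running tasks (taken from the normal-operation figure given earlier in this note) are assumptions, and the result is a hint for where to look first, not a diagnosis.

```python
def likely_hang_source(running_tasks_by_instance, normal_max=2):
    """Suggest where to look first, given running task counts per
    CommSessionMgr instance on one Siebel Server."""
    if not running_tasks_by_instance:
        return "no data"
    hung = [
        name
        for name, count in running_tasks_by_instance.items()
        if count > normal_max
    ]
    if not hung:
        return "no hang detected"
    if len(running_tasks_by_instance) < 2:
        # The comparison needs several instances on the same server.
        return "inconclusive (single instance)"
    if len(hung) == len(running_tasks_by_instance):
        return "Application Object Manager or SRBroker"
    return "CTI Driver"


# Instance labels below are hypothetical.
print(likely_hang_source({"instance 1": 18, "instance 2": 1}))
print(likely_hang_source({"instance 1": 18, "instance 2": 22}))
```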

Troubleshooting

For a Communications Session Manager component hang, it is highly recommended to verify with the CTI Driver provider whether you are running the CTI Driver with the latest patches available for the current version, or whether there are recommendations to upgrade to a newer version of the CTI Driver. CTI Driver providers have detected several call scenarios and CTI Driver parameter settings that can cause the CTI Driver to stop responding, and these have been fixed through patches to the CTI Driver or new versions of the CTI Driver (new libraries).

There are no known issues reported in the Communications Session Manager itself that cause the component to hang.

How to recover and improve stability for CommSessionMgr hang scenarios:

• The recommended approach is to manually kill the hanging Communications Session Manager process, to immediately prevent new CTI operations from being redirected to the hung CSM instance. You can identify the CommSessionMgr process ID from Site Map > Administration – Server Management > Components: select Communications Session Manager > Tasks and check the PID of the running tasks. The PID belongs to the siebmtshmw process, which needs to be killed in Windows Task Manager or with the "kill -9 <PID>" command in Unix. If the component advanced parameter Auto Restart is set to True (the default value), a new CommSessionMgr instance will be launched. An attempt to shut down the component may get no response, since the CSM instance is in a hung state, and it could shut down healthy CSM instances too.
• To improve stability, it is highly recommended to launch more than one CommSessionMgr component instance per Siebel Server. In most CSM hang scenarios only one instance is affected at a time, so the hang would only affect CTI users connected, or trying to connect, to the hung CommSessionMgr instance. Please refer to Note 477606.1 How Do You Use the MaxMTServers and MinMTServers Parameters To Improve Stability of the Communications Session Manager and To Manage Multiple CommSessionMgr Processes?.

Gathering Information for Technical Support

If you have verified that you are running the CTI Driver with the latest patches and still observe the issue, log a new Service Request with Siebel Technical Support, supplying as much of the following information as possible:
• Your current Communications Configuration file (.def). You can export it from Site Map > Communications Administration > All Configurations: select your current configuration and click Export Configuration, so that Technical Support can review all your Communications Configuration parameters, Profiles, Commands and Events.
• A copy of your siebns.dat file, located at Siebel Gateway Root Folder\Admin (Windows) and \sys for Unix.
• For one incident, the following matching information:
o A list of the CSM tasks running. You can generate it using the Siebel Server Manager command-line tool with the commands:
spool CSMTasksRunning.txt
list tasks for comp CommSessionMgr
spool off
o Set the event trace log level Server Requests Routing Info (SRMRouting event trace alias) for the CommSessionMgr component to level 4, and set the parameter “Use Shared Log File” to False. Provide all CommSessionMgr component log files from the Siebel Server.

o Enable the SComm.log debug mode as per Note 476563.1 How Can Users Activate Debug Level of Log Messages When Using CTI Functionality? and provide ALL SComm*.log files from the Siebel Server having issues. If the CommSessionMgr component is not running on the same Siebel Server as the Application Object Manager, provide ALL SComm.log files from the Siebel Servers running the Object Manager component too.
o The CTI Driver log files, if available. Please refer to the CTI Driver manual and the CTI Driver provider's Technical Support to verify how to enable debug tracing in the CTI Driver log files.
o The Siebel Enterprise Server log covering the component crash. The Siebel Enterprise Server log is located in the Siebel Server log folder, with the name <Siebel Enterprise Name>.<Siebel Server Name>.log.
o Note that it is very important to collect this information covering the first CommSessionMgr task that stays in running status for more than a couple of seconds, since the first task indicates the time stamp at which the hang started. Because SComm.log stores up to 1 MB of information by default, which usually covers only a couple of hours, you must collect the log files as soon as the running task is detected.
