Challenges


What is the best way to handle problems?

A Middleware Admin runs into many troubles in the daily job. Whenever a problem occurs, you must first understand it and identify whether it is a known or an unknown issue. A known issue may already have a workaround, which you can find through knowledge management. As a best practice, maintain a record of issues and their resolutions in a Knowledge Management Database (KMDB).

A problem can be understood with the following simple questions, which clarify which area is affected and what the reason for the issue could be.
  • What do we know?
  • What do we need to know?
  • What should we do?
Remember that every problem has a solution, or at least possible alternative solutions. You need enough knowledge to resolve the issue with wisdom; wisdom shows you the right path to the right decision.

Problem-solving strategy: Problem || Analysis: Data -> Information -> Knowledge -> Wisdom || Solution

Problem vs Issue

An issue arises unexpectedly for any number of reasons, such as the network being out of sync or the database server being overloaded. An issue turns into a problem when it is not identified and resolved in time. A problem needs to be fixed immediately, because its impact can be life-threatening. To put it simply, a fever is an issue that can be resolved with pills, while a heart attack is a life-threatening problem: if it is not handled immediately, the loss can be huge.

In middleware, the more troubles you encounter, the more you learn. At the same time, remember that your client always wants the business to run without interruption, so aim to deliver 99.9999% availability.

Bottom line: when the problem is not in the middleware environment, don't play the blame game. Do your best to support the team that is providing the solution.

Love issues, because they make you learn more about the subject. Enjoy the issues and troubles; they will sharpen your problem-solving skills, so... dear smart WLA, rock with them!!!

1. Port Conflict Issue:

While configuring and starting a new WebLogic instance, you might get an error like "Port already in use". There can be several reasons for this:

1. Multiple standalone instances might be running on the same machine, and one of them already uses the port you gave in the new configuration.

2. Apache might be running on the same port.

3. Other middleware might be running on the same machine on the same port.

On the Solaris operating environment there are several options:

1. Using the pfiles command

netstat -na|grep <port> --> identify whether the port is in use

pfiles <pid>|grep -i sockname |grep port --> run this for every Java process started by startWebLogic.sh or startManagedWebLogic.sh


2. Another, more costly way (it needs a third-party package) to find the process that is using a particular port is:

lsof -i tcp:<Port_number>

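Note that lsof reports only the processes owned by the user who runs the command, unless it is run as root.

3. Another way is a Perl script that calls getservbyport. Note that this only looks up the standard service name registered for a port on the system (for example in /etc/services); it does not tell you whether a process is currently listening on that port.

getservbyport(int port_number, const char *protocol_name)

A sample Perl script follows: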

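#!/usr/bin/perl
use strict; use warnings;
# Look up the service registered for TCP port 7001 (for example in /etc/services).
my ($name, $aliases, $port_number, $protocol_name) = getservbyport(7001, "tcp");
if (defined $name) {
    print "Name = $name\n";
    print "Aliases = $aliases\n";
    print "Port Number = $port_number\n";
    print "Protocol Name = $protocol_name\n";
} else {
    print "No service is registered for port 7001/tcp\n";
}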
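To check whether a port is actually free before starting a new instance, one rough approach is simply to try binding to it. The following is a minimal Python sketch under that assumption; the function name is_port_free is made up for illustration, and the check is only a snapshot in time (another process can still grab the port a moment later). Ports below 1024 will also be reported as not free when the script runs as a non-root user.

```python
import socket


def is_port_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except OSError:
        # Address already in use, or no permission to bind to this port.
        return False
    finally:
        sock.close()
```

For example, is_port_free(7001) could be called before starting a server that is configured to listen on port 7001.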

2.  Config.xml Repository Issue

When I copied a few JMS Queue related configurations from an existing WLS config.xml and pasted them into a newly created WLS domain's config.xml, the server startup got stuck at the following line in the WebLogic Server logs.

<Jul 28, 2008 12:08:26 PM EDT> <Notice> <Management> <BEA-140005> <Loading domain configuration from configuration repository at /path/

Sol: Verify every line in the config.xml file; an XML tag might be missing or left unclosed. In my case one line was missing, and after correcting it the server started fine.
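Rather than reading a hand-edited config.xml line by line, a quick well-formedness check can point at the offending position. This is a minimal Python sketch using the standard library XML parser; the function name check_well_formed is hypothetical, and the check only catches broken XML (missing or mismatched tags), not configuration that is well-formed but wrong for WebLogic.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses as XML, else a message with the error position."""
    try:
        ET.fromstring(xml_text)
        return None
    except ET.ParseError as err:
        line, column = err.position
        return "line %d, column %d: %s" % (line, column, err)
```

To check a file, read it as bytes and pass the content in, for example check_well_formed(open("config.xml", "rb").read()).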

3. Multicast Failure

While running the managed server on a remote machine, the following exception was raised:

<Jul 31, 2008 4:16:16 AM EDT> <Error> <Cluster> <BEA-000109> <An error occurred while sending multicast message: java.net.SocketException: Socket is closed

java.net.SocketException: Socket is closed

        at java.net.DatagramSocket.send(DatagramSocket.java:577)


Sol:

1.  Check that the machines involved in the cluster can ping each other.

2.  Verify the multicast IP with the MulticastTest utility. Try several IPs and ports, then use the combination that exchanges messages between the two participants.

AppMachine 1: 

$ java utils.MulticastTest -n  app1  -a 237.0.0.5 -p 9999

 

WebMachine 2: 

$ java utils.MulticastTest -n  app2  -a 237.0.0.5 -p 9999


The two machines must receive each other's messages; otherwise check with the network team.

3. Multicast heartbeats can be watched with the following command line:

java weblogic.cluster.MulticastMonitor <multicastIP> <port> mydomain mycluster
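Before testing on the network at all, it can help to rule out a simple typo in the cluster's multicast address. The sketch below is a small Python check, with a made-up function name, that only confirms the address falls in the IPv4 multicast range (224.0.0.0 to 239.255.255.255); it says nothing about whether the network actually routes multicast traffic between the machines.

```python
import ipaddress


def is_valid_multicast(address):
    """Return True if address is a syntactically valid multicast IP address."""
    try:
        return ipaddress.ip_address(address).is_multicast
    except ValueError:
        # Not an IP address at all.
        return False
```

With the address used in the MulticastTest example, is_valid_multicast("237.0.0.5") returns True, while an ordinary unicast address such as 192.168.1.10 returns False.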


4. Application not accessible

While accessing the application URL after deploying the web application, the following error was logged:

<Aug 19, 2008 10:58:07 AM EDT> <Error> <HTTP> <BEA-101017> <[ServletContext(id=29754737,name=webapp1,context-path=/contextpath)] Root cause of ServletException.
java.lang.StackOverflowError

Sol: Check the classpath. The error was caused by a communication gap between the servlet and an EJB.

5. Editing the WLS password

A new password was entered in boot.properties while the server was offline.

The Admin Server started, but starting the managed instances failed with the following error:

<Aug 22, 2008 8:21:54 AM EDT> <Critical> <Security> <BEA-090402> <Authentication denied: Boot identity not valid; The user name and/or password from the boot identity file (boot.properties) is not valid. The boot identity may have been changed since the boot identity file was created. Please edit and update the boot identity file with the proper values of username and password. The first time the updated boot identity file is used to start the server, these new values are encrypted.>
********************************************************
The WebLogic Server did not start up properly.
Reason: weblogic.security.SecurityInitializationException: Authentication denied: Boot identity not valid; The user name and/or password from the boot identity file (boot.properties) is not valid. The boot identity may have been changed since the boot identity file was created. Please edit and update the boot identity file with the proper values of username and password. The first time the updated boot identity file is used to start the server, these new values are encrypted.
************************************************************

Sol: Resolved by entering the new password in clear text in the boot.properties and startManagedWebLogic.sh files, and by renaming the ldap folder inside the managed server's instance folder in the domain folder.

When the server instance starts up, boot.properties is encrypted and written back. After that, the ldap folder is regenerated in the domain directory for the managed instance, and a new file is created there named after the instance, with the extension .ldif.

Apache Plug-in configuration

While configuring the Apache using the instructions given in the following link :

6. Deployment Error

 Module Name: apollo, Error: [HTTP:101179][HTTP] Error occurred while parsing descriptor in Web application "/domainpath/application.war" [Path="/source_code_path", URI="applicaiton.war"
org.xml.sax.SAXParseException: Element type "local-path" must be followed by either attribute specifications, ">" or "/>".
Sol:

Check the deployment descriptors (here weblogic.xml) and correct the line reported in the error.


7. JNDI Issue

While deploying an application that uses multiple JMS Queues, the following error appeared:

data couldn't be saved. The exception is : Error while doSenderQueueLookUp for :QUEUEJNDIUnable to resolve 'QUEUEJNDI' Resolved

Sol:

Verify that the Persistent Store is connected to the Connection Pool.


8.  New managed server Configuration issue

Here is a set of errors I hit while configuring a new WebLogic 9.2 MP3 managed server and adding it to a web cluster. When the managed server was started for the first time, it showed the following:

1. BEA-000362

2. BEA-000386

On Solaris box:

<Mar 30, 2009 2:56:49 PM EDT> <Critical> <WebLogicServer> <BEA-000362> <Server failed. Reason:
There are 1 nested errors:
java.net.UnknownHostException: mywebhostname.com: myhostname.com
        at java.net.InetAddress.getAllByName0(InetAddress.java:1128)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1098)
        at java.net.InetAddress.getAllByName(InetAddress.java:1061)

On Linux box:

<Critical> <WebLogicServer> <BEA-000386> <Server subsystem failed. Reason: weblogic.ldap.EmbeddedLDAPException: Unable to open initial replica url: http://myadminhost.com:7001/bea_wls_management_internal2/wl_management
weblogic.ldap.EmbeddedLDAPException: Unable to open initial replica url: http://myadminhost.com:7001/bea_wls_management_internal2/wl_management
        at weblogic.ldap.EmbeddedLDAP.getInitialReplicaFromAdminServer(EmbeddedLDAP.java:1319)
        at weblogic.ldap.EmbeddedLDAP.start(EmbeddedLDAP.java:221)
        at weblogic.t3.srvr.SubsystemRequest.run(SubsystemRequest.java:64)
        at weblogic.work.ExecuteThread.execute(ExecuteThread.java:209)
        at weblogic.work.ExecuteThread.run(ExecuteThread.java:181)
java.io.FileNotFoundException: Response: '500: Internal Server Error' for url: 'http://myadminhost.com:7001/bea_wls_management_internal2/wl_management'
        at weblogic.net.http.HttpURLConnection.getInputStream(HttpURLConnection.java:472)
        at weblogic.ldap.EmbeddedLDAP.getInitialReplicaFromAdminServer(EmbeddedLDAP.java:1296)
        at weblogic.ldap.EmbeddedLDAP.start(EmbeddedLDAP.java:221)
        at weblogic.t3.srvr.SubsystemRequest.run(SubsystemRequest.java:64)
        at weblogic.work.ExecuteThread.execute(ExecuteThread.java:209)
        Truncated. see log file for complete stacktrace

Solution/Work around:

BEA-000362

First, double-check your configuration: make sure the IP, VIP or domain name assigned in the config.xml file is correct. Then verify that network communication works for the hostnames given in config.xml, for example by pinging the Admin Server's DNS name from the managed server box. Also check that /etc/hosts has been updated with the new host name. Sometimes DNS table entries that have not been updated across WAN domains cause this issue.
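The name lookup itself can also be tested directly on the managed server box, independently of WebLogic. The following Python sketch is only an approximation of that check: it asks the operating system resolver (which consults /etc/hosts and DNS according to the system configuration) for the addresses behind a hostname. The function name resolve is hypothetical.

```python
import socket


def resolve(hostname):
    """Return the sorted IP addresses that hostname resolves to, or [] if it does not resolve."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # The resolver could not find the name.
        return []
    # Each entry ends with a socket address tuple whose first item is the IP.
    return sorted({info[4][0] for info in infos})
```

An empty list for a name taken from config.xml, for example resolve("myadminhost.com"), suggests the same lookup failure that the server reports as UnknownHostException.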

BEA-000386

Check whether the listen-address given for the managed server matches the DNS name or IP of the machine hosting it. You can do this in the WebLogic console, though that can be time-consuming.

   A Unix-savvy admin will always prefer to grep config.xml :)
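As an alternative to grep, the listen addresses can be pulled out of config.xml per server. The sketch below is a rough Python illustration and rests on assumptions: it assumes each server is described by a server element with name and listen-address child elements, and it ignores XML namespaces so that it does not depend on a particular WebLogic version. The function names are made up for this example.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def listen_addresses(config_xml_text):
    """Map each server name to its listen-address ('' when none is set)."""
    root = ET.fromstring(config_xml_text)
    result = {}
    for element in root.iter():
        if local_name(element.tag) != "server":
            continue
        # Collect the direct children of this server element by local name.
        fields = {local_name(child.tag): (child.text or "").strip() for child in element}
        if "name" in fields:
            result[fields["name"]] = fields.get("listen-address", "")
    return result
```

The returned dictionary can then be compared against the real hostnames of the machines, which is the same check described above for BEA-000386.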

 


Session Replication issues on Web-Tier cluster

BEA-101310 

Web server failed to perform batched update for replicated sessions

ohh my god!!!
 

BEA-000117 

Received a stale replication request

Whenever you encounter this error code in your logs, check the following:

1. Network connectivity between the machines hosting the cluster members.

2. Check that all the member servers in that cluster are alive.
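Both checks can be roughly approximated from any cluster machine by opening a TCP connection to each member's listen address and port. This Python sketch is a minimal illustration with a made-up function name; a successful connection only shows that something is listening and reachable, not that the server is healthy or that replication works.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve.
        return False
```

Calling can_connect for every member's host and port gives a quick picture of which members are unreachable from the current machine.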

BEA-090870 

Security issue while starting Admin Server

Exception:
####<Oct 28, 2010 3:38:41 AM PDT> <Error> <Security> <Unknown> <AdminServer> <[STANDBY] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <1288262321492> <BEA-090870> <The realm "myrealm" failed to be loaded: weblogic.security.service.SecurityServiceException: java.lang.ExceptionInInitializerError.
weblogic.security.service.SecurityServiceException: java.lang.ExceptionInInitializerError
at weblogic.security.service.CSSWLSDelegateImpl.initializeServiceEngine(CSSWLSDelegateImpl.java:342)
The suggested workaround: log in to the database and run the SQL file below, which executes the queries it contains.
For example, if the database user login is “weblogic”, log in to the database with that credential and run the script under
$BEA_HOME\wlserver_10.3\server\lib\rdbms_security_store_oracle.sql
Then restart your AdminServer.
 
 

BEA-000337 - Stuck Thread

** A handy first step is to find the server's Java process with jps -v|grep [instance]

First, take a thread dump (kill -3 on Unix, CTRL+Break on Windows) and open it in any thread dump analyzer.

Check the logs where the STUCK thread was reported: when did it first appear, and by how much has it exceeded the configured time? If the thread has been over the limit for more than 10 minutes and the thread pool has no idle threads, then stop that WebLogic server instance.
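Finding when a message code first appeared can be scripted. The following Python sketch assumes log lines shaped like the excerpts quoted in this post, where the timestamp is the first angle-bracket field (optionally preceded by ####) and the message code appears as its own field such as <BEA-000337>. The function name first_occurrence is hypothetical.

```python
import re

# The timestamp is the first <...> field, optionally preceded by "####".
TIMESTAMP = re.compile(r"^(?:####)?<([^<>]+)>")


def first_occurrence(log_lines, code="BEA-000337"):
    """Return the timestamp text of the first line carrying the code, or None."""
    marker = "<%s>" % code
    for line in log_lines:
        if marker in line:
            match = TIMESTAMP.match(line)
            return match.group(1) if match else None
    return None
```

Passing the lines of a server log file to first_occurrence gives the time of the first stuck-thread report, which can then be compared with the current time and the configured limit.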

WebLogic 9.2 MP3  Managed Server cannot start !!??!!

On Red Hat Linux 2.4 the WebLogic managed server could not start. I tried killing the process and starting it again many times, with no luck! Memory was in a comfortable range, the CPU had no load, and network connectivity was verified and in good condition. The WebLogic server log said nothing about why the server instance could not start.

Finally I checked the uptime for that box. It had been started more than a year earlier, so I thought rebooting the machine might help.

Yesss!!, it worked...

BEA-000388 -
JVM called WLS shutdown hook. The server will force shutdown now

Issue:

The WebLogic server shut down, with the following messages found in the server's standard out log files:

####<Sep 7, 2012 3:14:20 PM IST> <Notice> <WebLogicServer> <Ncorp-PLM-08> <Ncorp-PLM-08-AgileServer> <Thread-1> <<WLS Kernel>> <> <> <1347011060768> <BEA-000388> <JVM called WLS shutdown hook. The server will force shutdown now>

####<Sep 7, 2012 3:14:20 PM IST> <Alert> <WebLogicServer> <Ncorp-PLM-08> <Ncorp-PLM-08-AgileServer> <Thread-1> <<WLS Kernel>> <> <> <1347011060768> <BEA-000396> <Server shutdown has been requested by <WLS Kernel>>

####<Sep 7, 2012 3:14:20 PM IST> <Notice> <WebLogicServer> <Ncorp-PLM-08> <Ncorp-PLM-08-AgileServer> <Thread-1> <<WLS Kernel>> <> <> <1347011060771> <BEA-000365> <Server state changed to FORCE_SUSPENDING>

A possible cause and workaround for the error below:

<Sep 7, 2012 3:14:20 PM IST> <Notice> <WebLogicServer> <Ncorp-PLM-08> <Ncorp-PLM-08-AgileServer> <Thread-1> <<WLS Kernel>> <> <> <1347011060768> <BEA-000388> <JVM called WLS shutdown hook. The server will force shutdown now>

1) It seems that some component is sending an unexpected signal to the JVM. The JVM monitors and catches OS signals such as the CTRL+C event, the log off event and the shutdown event. When the JVM catches one of these signals, it shuts down the Java process.

Please try this possible solution:

· Specify the -Xrs parameter in the Java startup arguments and start the admin server

· Note: -Xrs is a non-standard option developed by Sun Microsystems for their HotSpot JVM. BEA JRockit continues to support this option; however the BEA JRockit non-standard option -XnoHup provides the same functionality. -Xrs reduces usage of operating-system signals by the JVM. If the JVM is run as a service (for example, the servlet engine for a web server), it can receive CTRL_LOGOFF_EVENT but should not initiate shutdown since the operating system will not actually terminate the process. To avoid possible interference such as this, the -Xrs command-line option does not install a console control handler, implying that it does not watch for or process CTRL_C_EVENT, CTRL_CLOSE_EVENT, CTRL_LOGOFF_EVENT, or CTRL_SHUTDOWN_EVENT Operation.


2) If the issue recurs even after trying the option above, apply the following Java option:

· Depending on the JVM version, it may be possible to get a thread dump before the process exits

· HotSpot supports the command-line option -XX:+ShowMessageBoxOnError

· The corresponding JRockit option is -Djrockit.waitonerror

· While the JVM goes down, it may prompt the user: "Do you want to debug the problem?"

· This pauses the JVM, thereby creating an opportunity to generate a thread dump (a stack trace of every thread in the JVM), attach a debugger, or perform some other debugging activity.

BEA-160192

<BEA-160192> <While upgrading weblogic.xml, encountered "max-in-memory-sessions" param "session-param". This param is unknown and will be removed>

You can see this error in your WebLogic logs because the deployment descriptor weblogic.xml has not been updated to the latest DTD defined for WebLogic 11g.


To resolve this, run setWLSEnv.sh and then the following command, which converts the descriptor to be compatible with WebLogic 11g. Because this file is a template, the base XML needs to be amended to the 11g DTD so that the build script then generates the correct file from it:

java weblogic.DDConverter -d . <<ear/warfile or webapp/ear directory>>



Courtesy: Prasanna Yalam, Brahmaiah