Today's Tech Headaches: 2014

Thursday, November 20, 2014

Splunk & Spring Integration

Today's headache is integrating the two!

Having limited Spring experience, I embarked on a wild journey of bean definitions and namespace resolution. It got really frustrating before it got enjoyable...

According to http://docs.spring.io/autorepo/docs/spring-integration-splunk/0.5.x-SNAPSHOT/reference/htmlsingle/, it seems easy. Add this and you're set, or are you?


<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xmlns:int="http://www.springframework.org/schema/integration"
 xmlns:int-splunk="http://www.springframework.org/schema/integration/splunk"
 xsi:schemaLocation="http://www.springframework.org/schema/integration/splunk
  http://www.springframework.org/schema/integration/splunk/spring-integration-splunk.xsd
  http://www.springframework.org/schema/integration
  http://www.springframework.org/schema/integration/spring-integration.xsd
  http://www.springframework.org/schema/beans
  http://www.springframework.org/schema/beans/spring-beans.xsd">

</beans>

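Well no, because going to http://www.springframework.org/schema/integration/splunk yields nothing. The namespace URI is only an identifier; the handler and schema have to come from a JAR on your classpath. I wish Spring mentioned that!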

You must add this to your dependencies:
http://maven-repository.com/artifact/org.springframework.integration/spring-integration-splunk/1.1.0.RELEASE
Cool. For me, that means IntelliJ, with Ivy for dependency management. Keep in mind IntelliJ Community Edition has no Spring support; I mention it because my IntelliJ was complaining about Spring being an "unknown facet". Also, I'm running my application on Jetty.

It should just work, but I continually got this:
Caused by: org.springframework.beans.FatalBeanException: Class [org.springframework.integration.splunk.config.xml.SplunkNamespaceHandler] for namespace [http://www.springframework.org/schema/integration/splunk] does not implement the [org.springframework.beans.factory.xml.NamespaceHandler] interface
What does this even mean? After lots of googling, I was led to believe I had a classloader issue. Do I add an entry to my web.xml, or to my project classpath? Do I add my JAR to my WEB-INF/lib folder (worst suggestion ever)? But everything in my IntelliJ lib folder is already on the classpath!! Then I thought, hm, maybe the error means exactly what it says, and it's because SplunkNamespaceHandler extends AbstractIntegrationNamespaceHandler...

Nope, none of the above. The problem was that the Splunk JAR (1.1.0) depends on Spring 4.0.2.RELEASE, whereas my project was still on an older Spring. My Spring context.xml looked like this:


<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:task="http://www.springframework.org/schema/task"
       xmlns:util="http://www.springframework.org/schema/util"
       xmlns:int-splunk="http://www.springframework.org/schema/integration/splunk"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
       http://www.springframework.org/schema/beans/spring-beans-3.1.xsd
       http://www.springframework.org/schema/integration/splunk
       http://www.springframework.org/schema/integration/splunk/spring-integration-splunk.xsd
       http://www.springframework.org/schema/context
       http://www.springframework.org/schema/context/spring-context-3.1.xsd
       http://www.springframework.org/schema/task
       http://www.springframework.org/schema/task/spring-task-3.0.xsd
       http://www.springframework.org/schema/util
       http://www.springframework.org/schema/util/spring-util.xsd">

</beans>

Changing all the Ivy dependencies to 4.0.2 and making sure IntelliJ's default Spring libraries aren't being used did the trick.
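
A quick way to spot this kind of mismatch is to list the Spring versions sitting in your lib folder. Below is a minimal sketch, not something from the original troubleshooting: it assumes JARs follow the usual `<artifact>-<version>.jar` naming, and the function name, the module list and the sample file names are all made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: JAR files follow the usual <artifact>-<version>.jar naming.
JAR_NAME = re.compile(
    r"^(spring-[a-z-]+?)-(\d+(?:\.\d+)*(?:[.-][A-Za-z0-9]+)?)\.jar$"
)

# Core Spring Framework modules that should all share one version.
# Other spring-* projects (e.g. spring-integration-splunk) are versioned
# independently, so they are deliberately left out of the comparison.
CORE_MODULES = {
    "spring-aop", "spring-beans", "spring-context", "spring-core",
    "spring-expression", "spring-tx", "spring-web",
}


def core_spring_versions(jar_names):
    """Map each version found in the file names to the core modules using it."""
    versions = defaultdict(list)
    for name in jar_names:
        match = JAR_NAME.match(name)
        if match and match.group(1) in CORE_MODULES:
            versions[match.group(2)].append(match.group(1))
    return dict(versions)


if __name__ == "__main__":
    # Hypothetical lib folder contents, for illustration only.
    sample = [
        "spring-core-4.0.2.RELEASE.jar",
        "spring-beans-3.1.0.RELEASE.jar",
        "spring-integration-splunk-1.1.0.RELEASE.jar",
    ]
    found = core_spring_versions(sample)
    if len(found) > 1:
        print("Mixed core Spring versions:", found)
```

To check a real folder, pass it `os.listdir("lib")` (or wherever Ivy retrieves your JARs). More than one key in the result means more than one Spring version is on the classpath.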

Also, my perfect girlfriend asked for a shoutout, so here it is! :)

Monday, October 20, 2014

Packer, Vagrant and Windows

Fun day of setting up Packer with configurations my colleague put together :)
Packer is a way for you to build a Vagrant box locally with all the software and configurations you need, without having to transfer an enormous VBox around. Very neato.
On running a simple "packer build ", your OS should build from scratch. The first problem I encountered:
The executable 'bsdtar' Vagrant is trying to run was not
found in the %PATH% variable. This is an error. Please verify
this software is installed and on the path.

What do you expect I'd do? I located bsdtar.exe in "C:\HashiCorp\Vagrant\embedded\mingw\bin" and added that directory to my PATH. Then I got another error:

The box failed to unpackage properly. Please verify that the box
file you're trying to add is not corrupted and try again. The
output from attempting to unpackage (if any):

x Vagrantfile
x box.ovf
x metadata.json
x ubuntu1404-disk1.vmdk: Write failed
Packer/bsdtar.EXE: Error exit delayed from previous errors.

Well, that was useless. Long story short, it seems to be a bug when upgrading Vagrant from an older version to a newer one. I had upgraded from 1.6.3 to 1.6.5. Uninstalling my current installation and reinstalling Vagrant 1.6.5 fixed the issue.

Getting past the initial setup, we wrote scripts to automate installation of a particular IBM product: Maximo. Here are some of the dependencies and steps:

Install WebSphere Application Server or Oracle WebLogic, ~2-3GB
A database (I used Oracle 11g R2); at a minimum this takes about 5-10GB of space
Yum packages (e.g. Ant, Oracle DB pre-reqs)
Open File descriptors, kernel property changes
Running maxinst.sh

Installing WebLogic and Oracle DB doesn't actually take long. Maxinst takes the bulk of the time, and has a tendency to fail. A couple of key notes I took for Packer:

The documentation suggests using a post-processor to keep "intermediary artifacts" (the vbox) like so:
```
  "post-processors": [
    {
      "output": "builds/centos65-wwm-base.box",
      "type": "vagrant",
      "keep_input_artifact": true
    }
  ]
```
The trouble is, I still got "Deleting output directory" at the end of a failed build, which suggests "keep_input_artifact" only takes effect if your build succeeds (I'm guessing, I never tried). Horrible stuff: you're going to automatically delete 3 hours' worth of builds with no way for me to keep my vbox? Not happy, HashiCorp.
I like to lock my screen while stuff runs in the background. With Packer? Bad idea.

Wednesday, October 8, 2014

SoapUI working with IBM JRE

In short: SmartBear does not support running SoapUI on the IBM JRE. All efforts lead to a response of "use the Sun JRE".
Why would you use the IBM JRE? This is to send JMS messages to WebSphere's SI Bus, where the Application Server has Global Security turned on. You are required to set these 2 JVM properties:
-Dcom.ibm.CORBA.ConfigURL
-Dcom.ibm.SSL.ConfigURL

If you don't do this and attempt to send a message, you get a WsnInitialContext exception.
Once you've configured soapui-pro.sh to use the IBM JRE, you'll find that you won't be able to activate/use your license (even if you'd activated it while using the Sun JRE). You'll go through the process of re-activating your license, but be told you're missing a valid license.
After a day's effort of trying different things, such as copying the Sun JRE's security providers into the IBM JRE's "java.security" file, I ended up decompiling soapUI's code. It appears soapUI's decryption method is "RSA - SunJCE - 512", which requires the "BouncyCastle" security provider. The solution was to add this line to the JRE's java.security file:
security.provider.1=org.bouncycastle.jce.provider.BouncyCastleProvider
Voila, you can now activate your license. Although...

SoapUI Pro 5.1.2 has a gotcha when running testrunner.sh. It will attempt to validate your license as well, and requires you to have X11 forwarding enabled (no matter what). So if you're like me and are running SoapUI Pro on a headless Linux environment, you're stuffed. We ended up downgrading to 5.0.0, where this X11 port forward is not required.
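
A note on the java.security edit above: provider entries are numbered in order of preference, and the file normally already contains a `security.provider.1` line. The sketch below shows one way to put a new provider first while shifting the existing ones down. It is an assumption about how you might want to do it, not what was done in the original fix, and the function name and sample provider classes are hypothetical.

```python
import re

# Matches lines such as "security.provider.3=some.provider.ClassName".
PROVIDER_LINE = re.compile(r"^security\.provider\.(\d+)=(\S+)\s*$")


def prepend_provider(java_security_text, provider_class):
    """Make provider_class entry 1 and shift every existing provider down by one.

    Assumes the existing entries are numbered consecutively from 1.
    """
    result = []
    inserted = False
    for line in java_security_text.splitlines():
        match = PROVIDER_LINE.match(line)
        if match is None:
            # Not a provider entry: keep the line untouched.
            result.append(line)
            continue
        if not inserted:
            # Place the new provider right before the first existing one.
            result.append(f"security.provider.1={provider_class}")
            inserted = True
        result.append(
            f"security.provider.{int(match.group(1)) + 1}={match.group(2)}"
        )
    if not inserted:
        # No provider entries at all: just add ours at the end.
        result.append(f"security.provider.1={provider_class}")
    return "\n".join(result)


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    sample = "\n".join([
        "security.provider.1=com.example.ProviderA",
        "security.provider.2=com.example.ProviderB",
    ])
    print(prepend_provider(
        sample, "org.bouncycastle.jce.provider.BouncyCastleProvider"
    ))
```

Back up the original java.security before writing anything back, since a broken provider list can stop the JVM from doing any crypto at all.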

Monday, August 25, 2014

Maximo startup problems!

Our infrastructure utilizes WebSphere MQ as our Maximo queue backend. Our automation framework (consisting mainly of jython scripts injected into wsadmin) sets up CQIN and SEQIN queues, as well as activation specifications and whatnot, with 1 click of a button, so our margin for error is pretty low once it's off the ground running.
So when this error appeared in SystemOut.log on Maximo startup it was quite discomforting:

[8/25/14 15:53:39:542 EST] 000003d9 SystemOut O 25 Aug 2014 15:53:39:510 [ERROR] [MXServer] [] java.lang.NullPointerException
    at psdi.iface.jms.JMSContQueueProcessor.processMessage(JMSContQueueProcessor.java:253)
    at psdi.iface.jms.JMSListenerBean.onMessage(JMSListenerBean.java:203)
    at com.ibm.ejs.container.WASMessageEndpointHandler.invokeJMSMethod(WASMessageEndpointHandler.java:138)
    at com.ibm.ws.ejbcontainer.mdb.MessageEndpointHandler.invokeMdbMethod(MessageEndpointHandler.java:1146)
    at com.ibm.ws.ejbcontainer.mdb.MessageEndpointHandler.invoke(MessageEndpointHandler.java:844)
    at com.sun.proxy.$Proxy33.onMessage(Unknown Source)
    at com.ibm.mq.connector.inbound.MessageEndpointWrapper.onMessage(MessageEndpointWrapper.java:131)
    at com.ibm.mq.jms.MQSession$FacadeMessageListener.onMessage(MQSession.java:125)
    at com.ibm.msg.client.jms.internal.JmsSessionImpl.run(JmsSessionImpl.java:2747)
    at com.ibm.mq.jms.MQSession.run(MQSession.java:950)
    at com.ibm.mq.connector.inbound.ASFWorkImpl.doDelivery(ASFWorkImpl.java:88)
    at com.ibm.mq.connector.inbound.AbstractWorkImpl.run(AbstractWorkImpl.java:216)
    at com.ibm.ejs.j2c.work.WorkProxy.run(WorkProxy.java:668)
    at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1862)

This error was flooding the logs every few milliseconds, and causing CPU starvation!
It seemed to suggest the "JMSContQueue" was trying to "processMessages" (duh). The problem: The WebSphere MQ infrastructure (which we didn't own) did not have the queue yet. For some reason Maximo polls infinitely for the queue...so I changed "intjmsact" to point at a queue which did exist, and voila!

But that wasn't the end of it! When the queue was finally created and "intjmsact" was configured back to point at the original queue, same error message!
This time, the problem was that there were messages already on the queue, which Maximo did not recognize. Maximo picked them up, rejected them, and put them back on the queue, causing yet another infinite cycle. Deleting the messages resolved the issue.

Sunday, August 17, 2014

Splunk and lookups

The client upgraded Splunk from 5.0.8 to 6.1.2, a worthwhile upgrade imho. But it broke my query, possibly due to a bug.

Given this query (not exact, for commercial reasons):


index=prod sourcetype=wps.log module="PXY_*" (`transaction_filter`)
  | dedup host _raw
  | eval timestamps=_time
  | convert timeformat="%s" ctime(_time) as TimeStamp
  | search [| inputlookup outages | eval StartTime = strftime(strptime(Start,"%d/%m/%Y, %H:%M"),"%s")
            | eval EndTime = strftime(strptime(End,"%d/%m/%Y, %H:%M"),"%s")
            | eval search = "(TimeStamp < \""+StartTime+"\" OR TimeStamp > \""+EndTime+"\")"
            | fields search | mvcombine search | eval search = "(" + mvjoin(search, " ") + ")"]

I had used this in v5 to filter out results that fell within an outage period. The pre-req for this is a lookup table called 'outages'.

The result of the subsearch looked like this.

((TimeStamp < "1398949200" OR TimeStamp > "1398974400") (TimeStamp < "1399554000" OR TimeStamp > "1399575600") (TimeStamp < "1399726800" OR TimeStamp > "1399748400") (TimeStamp < "1399986000" OR TimeStamp > "1400011200") (TimeStamp < "1400072400" OR TimeStamp > "1400097600") (TimeStamp < "1400418000" OR TimeStamp > "1400443200") (TimeStamp < "1400504400" OR TimeStamp > "1400529600") (TimeStamp < "1400763600" OR TimeStamp > "1400788800") (TimeStamp < "1400763600" OR TimeStamp > "1400778000") (TimeStamp < "1400936400" OR TimeStamp > "1400958000") (TimeStamp < "1401282000" OR TimeStamp > "1401307200") (TimeStamp < "1401454800" OR TimeStamp > "1401516000") (TimeStamp < "1401541200" ))

Before the upgrade, it just worked as it should've. After upgrade, nada. Defect perhaps?
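
For anyone trying to follow what the subsearch is meant to do, here is the same filtering logic sketched in plain Python rather than SPL. It only illustrates the intent (keep an event if it falls outside every outage window); it says nothing about why v6 behaves differently. The function names and sample dates are made up, and it uses naive datetimes where the query compares epoch seconds.

```python
from datetime import datetime

# Same format string as the strptime() calls in the query.
OUTAGE_FORMAT = "%d/%m/%Y, %H:%M"


def parse_outage(start, end):
    """Turn one row of the 'outages' lookup (Start, End) into datetimes."""
    return (
        datetime.strptime(start, OUTAGE_FORMAT),
        datetime.strptime(end, OUTAGE_FORMAT),
    )


def outside_all_outages(event_time, outages):
    """True when the event is before the start or after the end of every outage.

    Mirrors the generated search string: each window becomes
    (TimeStamp < Start OR TimeStamp > End) and the windows are ANDed together.
    """
    return all(event_time < start or event_time > end for start, end in outages)


if __name__ == "__main__":
    # Hypothetical outage rows and events, for illustration only.
    outages = [
        parse_outage("01/05/2014, 23:00", "02/05/2014, 06:00"),
        parse_outage("08/05/2014, 23:00", "09/05/2014, 05:00"),
    ]
    events = [datetime(2014, 5, 1, 22, 0), datetime(2014, 5, 2, 1, 0)]
    kept = [event for event in events if outside_all_outages(event, outages)]
    print(kept)
```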

Monday, August 11, 2014

Websphere Messaging Engine not starting

In trying to automate WebSphere installation, we ran into the titled problem.
As with my other posts, we've got corporate DBAs who we engage to create user accounts and databases for us. Our initial guess was the user account we had created for us didn't have the right privileges, but there were no SQL exceptions in FFDCs. When starting the messaging engine, we'd get this error:

The messaging engine "ME_name" cannot be started as there is no runtime initialized for it yet, retry the operation once it has initialized. For the runtime to successfully initialize the hosting server must be started, have its 'SIB service' already enabled, and dynamic configuration reload enabled. If this is a newly configured messaging engine and it is the first messaging engine to be hosted on this server, then it is most likely the 'SIB service' was not previously enabled and thus the server will need to be restarted. The messaging engine runtime might not be initializing because of an error while trying to start, examine the SystemOut.log of the hosting server to check for error messages indicating the problem

The node server SystemOut.log revealed pretty much nothing. The nodeagent had a number of FFDCs. So I thought perhaps it was a firewall problem, and I was on the right track...
We found:

Port 9420 was new to us. We were used to WebSphere v7, and looking through serverindex.xml we noticed a port called Status Update Listener.
Running netstat on the node server showed that the listening ports did not match the ones we had asked to be opened through the firewalls, so we changed them.
The FFDCs had an "UnknownHostException: *". The application server wasn't starting properly either, so this error pointed me in the right direction. The host needs to be defined for at least the SOAP_CONNECTOR_ADDRESS; for the IPC connector port we set it to localhost.
I got the messaging engine running by setting the schema (under Bus > Messaging Engine > Message Store > Schema) and the user to the same value.

Thursday, August 7, 2014

Firewalls in the corporate

Jeez getting a ZIP file to where I needed it today was such a pain! I got WinSCP onto a Windows VM where a copy of Maximo was installed, to SCP the directory structure to a Linux box (Maximo admins: you have to do this because someone decided Maximo could only be installed on Windows). Turns out after a bit of debugging the Windows box wasn't in the same VLAN segment as the Linux boxes. So if this happens to you:
telnet to port 22 times out (try it in both directions)
WinSCP times out
tracert times out
Turning off iptables on the Linux box does nothing (/etc/init.d/iptables stop)

...then you probably have your "Windows" machine in the wrong place.
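
If you don't have telnet handy, the port check is easy to script. A minimal sketch using only the Python standard library; the function name and the host in the usage note are placeholders, not values from my setup.

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unresolvable host names.
        return False
```

Calling `port_reachable("linux-box.example", 22)` from the Windows side, and the equivalent from the Linux side, tells you the same thing as the telnet test. A timeout in both directions with iptables off points at the network segment rather than the hosts.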

Thursday, July 31, 2014

Ansible and Corporate security

I was admittedly poking my nose into my colleagues' troubles while they were describing setting up Ansible on a Go "CI Server", which needed to SSH to our WebSphere Deployment Managers. Cos as you know, Ansible doesn't use a server-agent topology like Puppet does, but instead connects over SSH using keys.

Trouble is, Ansible must SSH in as "root" while the Go agent must run as "go", meaning you have to SSH as a different user. That means SSH keys stored under the "root" account to avoid logging in each time, which is a breach of security! This sparked discussions about actually getting a "root" user accessible to us (which no sane security team is gonna allow). It makes more sense to get sudo access for "go", or whichever user we settle on for the CI server, but we've yet to work that out.

Sunday, July 27, 2014

Hackintosh setup!

I thought I'd take a minute to 'document' my experience setting up a Hackintosh on a brand new PC with an existing Windows 7 Professional (64-bit) install. I was a complete noob at this, and coming out of it I'm just bedazzled by how many legacy instructions are out there.
Don't do everything Google tells you to do.

I'll highlight VERY CLEARLY what the pre-requisites are (and critique if not clear enough, let's work together brahs):
1) Knowing what type of partition your OS boots from (MBR or EFI).

2) Depending on the above, choosing a bootloader (Clover for UEFI, Chimera for MBR). Don't follow the TonyMac website if you're using EFI, you'll end up needing to format your PC.
3) BACKING UP your EFI files. Nothing frustrated me more than not knowing what partition these were on, or where they were on the Windows CD.
4) Make sure you buy a USB stick. If you accidentally delete your EFI partition, at least you can still boot into Mac with this.
5) A copy of Mac OS. I had an Apple laptop lying around, so I downloaded the OS from the AppStore. Otherwise ask your friends to get it for you.
6) 2 disk drives is preferable, 1 for each OS. I have Windows on my SSD, and Mac on my HDD.
7) Most importantly, get familiar with extending partitions and creating them with fdisk in Mac OS. Loads of Hackintosh setup sites have the commands you'll need (I deleted and recreated partitions at least 5 times before it worked).

My SSD has a 100MB EFI partition. Hackintosh instructions state to set this to 200MB. PAY NO ATTENTION. Just stick with 100MB and don't mess with the sizes of your partitions, or they'll end up starting at offsets you don't want. Again, I used a PC with Windows 7 already installed.

Saturday, July 26, 2014

Vagrant proxy on Cygwin

Cygwin + Ruby + Vagrant = Pain!

Vagrantfile. Simple. I was given this by a colleague running Vagrant on a Mac.

 Vagrant.configure("2") do |config|  
  config.vm.box = "base"  
  #Ensure proxy plugin is installed - $ vagrant plugin install vagrant-proxyconf  
  config.proxy.http   = "http://XXX"  
  config.proxy.https  = "http://XXX"  
  config.vm.define :box1 do |box1|  
   box1.vm.network :private_network, :ip => "XXX"
   box1.vm.hostname = "box1"  
   box1.vm.synced_folder ENV['HOME'], "/home/vagrant/home"  
   box1.vm.provider "virtualbox" do |v|  
    v.customize ['modifyvm', :id, '--memory', '4096']  
    v.customize ['modifyvm', :id, '--cpus', '2']  
   end  
   box1.vm.provision :puppet do |puppet|  
    puppet.manifests_path = "./manifests"  
    puppet.module_path = "./modules"  
    puppet.manifest_file = "site.pp"  
    puppet.options = "--verbose"  
   end  
  end  
 end

Try to install vagrant-proxyconf, and you get a barrage of errors if your Cygwin doesn't have the required libraries. So I'm gonna take it from the top here.

First thing, getting Ruby. Using Ruby from http://rubyinstaller.org/downloads/ caused headaches with running "gem", where I had to add "gem.bat" to my .bashrc file as an "alias", so I just installed Ruby through Cygwin's setup-x86_64.exe instead.

Next, getting RubyGems. Get it here: https://rubygems.org . Put it in C:/RubyGems
Install RubyGems:
cd C:/RubyGems
ruby setup.rb install

Required Cygwin libraries (setup-x86_64.exe again):
libcrypt-devel
gcc (I had to search gcc then select the whole "Devel" package to make sure I got this, change the dropdown "Devel (Default)" to "Devel (Install)" if not clear)
make

If you run "vagrant plugin install vagrant-proxyconf" at this point, you'll get this error:

DEBUG [dc362284] Bundler::GemNotFound: Could not find json-1.8.1.gem for installation
DEBUG [dc362284] An error occurred while installing json (1.8.1), and Bundler cannot continue.
DEBUG [dc362284] Make sure that `gem install json -v '1.8.1'` succeeds before bundling.

So run "gem install json -v '1.8.1'". If you get problems with this, it's because you don't have gcc, make or libcrypt-devel.
Then run "vagrant plugin install vagrant-proxyconf". What I ran into was:
Make sure that `gem install json -v '1.8.1'` succeeds before bundling.

Wtf? But my gem installed successfully! After finding out I had another Cygwin installation (though not on any path), I removed that folder and it still didn't work. I reinstalled Vagrant and it FINALLY...progresses.

It gets to the part to download the box, and then spits another error:
C:/HashiCorp/Vagrant/embedded/lib/ruby/2.0.0/uri/common.rb:176:in `split': bad URI(is not URI?): file://C:\cygwin\home\michart\base (URI::InvalidURIError)
Ruby's URI parser doesn't accept Windows paths with backslashes! Or maybe it's Cygwin being stupid. You would expect a widely used scripting language to have figured this out by now. So what I did was add the uri.gsub! line to the split method in common.rb:


    # Returns a split URI against regexp[:ABS_URI]
    def split(uri)
      uri.gsub!('\\','/')
      case uri
      when ''
      ...
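
For comparison, here is the same idea outside Ruby: a minimal Python sketch that turns a Windows path into a well-formed file URI instead of patching the parser. This is not what Vagrant does internally, and the function name is made up; it just shows what a valid URI for that path looks like.

```python
from pathlib import PureWindowsPath


def windows_path_to_file_uri(path):
    """Convert an absolute Windows path into a file:// URI with forward slashes."""
    # PureWindowsPath understands backslashes and drive letters on any OS;
    # as_uri() raises ValueError if the path is not absolute.
    return PureWindowsPath(path).as_uri()


if __name__ == "__main__":
    print(windows_path_to_file_uri(r"C:\cygwin\home\michart\base"))
    # prints file:///C:/cygwin/home/michart/base
```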

Voila, it tries to download the box! But then I hit yet another problem...
An error occurred while downloading the remote file. The error
message, if any, is reproduced below. Please fix this error and try
again.

Couldn't open file /cygwin/home/michart/base

Well this one's probably just worded horribly, but my proxy was inaccessible, corporate VPN client looks like it's down over the weekend...blah.
Do note however, when it's just a simple "vagrant up" with a "vagrant init" Vagrantfile, I had no issues. This looked like extra steps I needed for "vagrant plugin install" to work.
Hope this helps anyone trying to get Vagrant working!