VaibhaV Sharma

life @ vsharma . net

Dealing with Legacy Infrastructure


Dealing with legacy infrastructure is like dealing with the sunset scenario of an old car. If timed right, maximum ROI can be extracted from the old car before investing in the latest and greatest technology of a new one. Migrating infrastructure to the cloud is one transition option, but there are still significant use cases for physical on-premises or self-hosted infrastructure.

Any organization that has grown “organically” and/or has been in business for more than a few years ultimately needs to deal with legacy infrastructure. As organizations get older, this cycle repeats and, hopefully, gets better every few years.

Legacy infrastructure and legacy applications are related but distinct problems. In this context, infrastructure means servers, switches, firewalls, routers, storage, racks, cabling, power and PDUs.

Legacy infrastructure is –

  • EOS (End Of Sale) / EOL (End of Life) / EOSup (End of Support)
    • The manufacturer will not sell, repair, upgrade or support
  • About to go EOS / EOL / EOSup
    • The manufacturer will support it, but only if you sign over your next child’s lifetime earnings
  • Was being maintained but that team member left / was let go
  • Bought this expensive unit for project X but that never took off

Why Upgrade?
Ideally, legacy infrastructure management starts at the time of purchase. A proper architecture, procurement and implementation strategy goes a long way toward maximizing ROI over the years.

Mature organizations (large and small) tend to have a policy driven cycle of regular upgrades fueled by engineering decisions, asset depreciation or even compliance certification. For SLA driven hosted service providers, planning for upgrades is (or should be) a critical part of infrastructure strategy.

Full Stack Migrations
Top to bottom, a typical infrastructure stack looks like this –

– (Out of Scope) OS / Hypervisor / Application
– Servers
– Storage
– Layer 3+ Devices – Firewalls / Routers
– Layer 2 – Switches
– Cabling
– Racks
– PDUs
– Power Circuits

Full stack migrations are a lot of work but a lot cleaner. They also keep project managers employed. Design and build a new stack (power circuits to OS) and then do a massive, coordinated application migration to the new stack. In most situations, this is not an option.

Partial upgrades at individual layers are messy but more common. This keeps systems admins busy (and employed).

Power Circuits and PDUs
As applications scale, growing resource demands drive the need for higher capacity or higher density equipment. If the building has more power available, run new circuits with different / higher power capacity (220V instead of 120V, or 3-phase) and then swap out the old PDUs. One typical roadblock with power is cooling capacity. Some cooling efficiency can be gained by adding cold aisle containment.
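As a rough back-of-the-envelope comparison, here is a quick sketch (assuming the common 80% continuous-load derating; the circuit ratings are just examples):

```python
# Rough usable-capacity comparison for rack power circuits.
# Assumes the common 80% continuous-load derating; ratings are illustrative only.
def usable_watts(volts: float, amps: float, derate: float = 0.8) -> float:
    return volts * amps * derate

print(usable_watts(120, 20))   # 1920.0 W on a 120 V / 20 A circuit
print(usable_watts(208, 30))   # 4992.0 W on a 208 V / 30 A circuit
```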

Power circuits + PDUs can be swapped out without downtime if old PDUs and power cabling have been installed properly. If old cabling is a mess, scheduled downtime might be a better option than blowing fuses and power supplies when cables come loose.

Most PDUs have a long life and can be reused. Apart from some management features, flexible designs and increased densities, PDUs have not seen much innovation over the years. They don’t fetch much on the secondhand market either, so you might as well keep and reuse them.

Racks
Most older racks can be reused as-is unless new designs demand better cabling, physical security or cooling. Space permitting, rear extensions can be added to some racks to accommodate additional accessories, better cable management or bigger PDUs. When picking racks, the deeper, wider and taller, the better.

Cabling
Old network and power cables can be reused unless connectivity designs have changed. A different cable type (Cat7A vs. Cat6, PDU-style vs. 3-prong power) typically does not make much difference for average applications unless the equipment is pushing close to data and power limits. Cabling upgrades, if done right, are typically among the most expensive processes in terms of labor cost and time. But cabling done right can often survive the next several iterations of legacy infrastructure refresh. Lab environments are different; there, ripping out and replacing (or re-running) old cables is the best policy.

Cabling designs deserve their own post. Something for the future.

Network Switches
Upgrading a 10 year old 1Gb copper network switch to a latest-generation 1Gb copper network switch might not be very beneficial unless the backbone network has moved to different, higher-speed interfaces. Network switch upgrades can be very disruptive if critical servers do not have multiple links into the access / storage layers. Core switch upgrades are even worse if cabling and other issues are not considered when provisioning them.

Network switches do have some resale value in the secondhand market. Older switches are also a very good option for lab environments, test stacks or even training / R&D. Swapping network vendors between upgrades is sometimes a religious and team morale issue but is possible. I know because I have been there, done that, survived and apparently, people still like me.

Layer 3+ Devices – Firewalls / Routers
In a stable environment with most of the flux at the application layer, edge routers often do not see much change over the years. Except for occasional software upgrades and configuration updates, edge routers sit rock solid. Edge / core router upgrades are easier if appropriate routing and redundancy protocols (HSRP, for example) are implemented. The typical upgrade process involves replacing the standby unit, switching traffic to the new unit and then doing the same with the active unit.

Firewalls are a bit more involved in application stacks and require constant security upgrades, configuration changes, etc. As traffic grows, firewalls run out of capacity and need to be replaced. Most decent firewalls do have redundancy options. If connection state replication is available, upgrades can be close to non-disruptive; without it, most firewall changes are very disruptive.

With advance planning and proper implementation, router and firewall upgrades can avoid downtime entirely or at least reduce it significantly.

Storage
DAS, NAS, SAN – of the three flavors, DAS is obviously the most disruptive to upgrade, since physical changes are needed.

NAS and SAN upgrades are easier if the underlying protocol remains the same. Even without depending on the storage vendor, NAS and SAN upgrades can be relatively painless (though not pain-free) if the application supports online data migration.

For example, VMware can do Storage vMotion between two storage targets: an entire virtual machine can be moved to newer storage without any downtime. Oracle RAC, on the other hand, ties in tightly with the storage backend through shared volumes for data, voting disks and other cluster uses. Downtime is almost guaranteed in most cases unless storage-layer clustering magic saves the day.

Most older storage units do not have clustering or non-disruptive data replication and transition options. Newer units can add new cluster partners and migrate services over transparently. All of that sounds very nice, but it is not possible if the old unit does not have clustering configured right.

Once decommissioned, old storage can be re-provisioned as backup storage for non-critical environments or used for internal training / R&D.

Servers
If an application is running well with no server capacity issues, the servers are best kept as-is. If it ain’t broke, don’t fix it. If the hardware is out of warranty, keep a few spare units to cannibalize for parts. Old models often flood the secondhand market in batches and make for very good spare-part bins. If the team has some skill and time available, old servers make for excellent clusters. A nice Hadoop cluster with a large NFS storage pool for backups, or even a simpler storage cluster using something like Gluster, is easily possible.

Older servers with more RAM and faster HDDs fetch more in the secondhand market.

What to do with an old pile of hardware?
Here is where company policy and culture can make a big difference. To extract maximum ROI out of old hardware, some level of inventory and resource management skill is needed. Some creative re-assignment of old hardware can provide a significant boost to newer POC initiatives with temporary and ever-changing resource needs.

There are several other ways to deal with old hardware –

  • Resell / liquidate
  • Donate it (Tax Benefit?)
  • Re-use in test / non-critical environments
  • Use as a break / fix / ops training tool
  • Give it away as an employee perk. A lot of people run a small datacenter in their garage at home.

Infrastructure management is a lot of hard work but provides the necessary resources to make an organization successful. If not managed well, infrastructure tends to be a large sinkhole for hard cash.

Do not blindly design and run your infrastructure with the aspirations of copying Google or Yahoo!. Every organization is different.

YMMV.

Passive Network Security Monitoring


Most people visualize “IT security” as something sophisticated – protected by bodybuilders with dark glasses, men-in-black type imagery and firewalls!! Focusing just on network security, one way to slice it is into active and passive network security.

Network security is a constant battle of keeping up with new software / system exploit techniques. Network and application traffic needs to be constantly monitored to identify new exploit patterns. Passive monitoring tools can record, analyze, correlate and produce highly valuable security intel specific to a network.

You don’t need to shell out a pentabillion $$ for turnkey commercial solutions. The Free / Open Source community has a lot of it covered. It does help if you know what you are doing.

Active (in-line) monitoring typically includes “bump in the wire” type solutions –

  • Firewalls (yeeaah!)
  • Malware scanners (Spam, Phishing, Virus)
  • Whitelisting / blacklisting at various layers
  • Encryption

Active measures are good first steps but they are only as effective as the signature data and configuration driving them. Every organization’s traffic profile is different and a lot of times boilerplate active measures are not very effective or go stale very quickly.

Most firewalls are configured to block or allow combinations of IP / port / protocol. Some with more resources and features can do DPI (Deep Packet Inspection) to catch malware or intrusion attempts and also function as an IPS (Intrusion Prevention System). Malware scanners depend on pre-configured patterns of known bad attachments or phishing URLs. Whitelisting / blacklisting rules need to be updated on a regular basis to be effective.

A passive monitoring system can be configured to parse a copy of live network traffic, flag known anomalies and take action or log it for a human to look at. Someone then does all the hard work of identifying new patterns and publishing them for general consumption. Thanks to Free / Open Source projects, a lot of this work is available in the open.
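As a toy illustration of the idea, here is a minimal sketch using Python and scapy (the interface name and the “suspicious” ports are made up for the example; a real engine does far more):

```python
# Toy passive monitor: watch a mirror/SPAN port, log flow metadata,
# and flag connections to "suspicious" ports. Requires root privileges.
from scapy.all import sniff, IP, TCP

SUSPICIOUS_PORTS = {23, 2323, 4444}   # illustrative patterns only

def inspect(pkt):
    if IP in pkt and TCP in pkt:
        src, dst, dport = pkt[IP].src, pkt[IP].dst, pkt[TCP].dport
        with open("flows.log", "a") as log:    # flow metadata for later correlation
            log.write(f"{src},{dst},{dport}\n")
        if dport in SUSPICIOUS_PORTS:
            print(f"ALERT: {src} -> {dst}:{dport}")

# "eth1" is a placeholder for the interface receiving mirrored traffic
sniff(iface="eth1", filter="tcp", prn=inspect, store=False)
```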

A good passive monitoring engine –

  • Can consume and keep up with monitored traffic
  • Can parse and de-construct connection flows on the fly
  • Can log any / all flow metadata (as configured) for correlation
  • Can apply pre-defined identification rules and flag suspicious activities
  • Has flexible configuration to define new patterns on the fly

There are several mature Free / Open Source projects that can help.

Snort / Suricata
Link: Suricata
Link: Snort

Snort used to be the de facto IDS / IPS engine of choice for anyone looking to run an IDS. Somewhere along the way, like many other wildly popular Open Source projects, it was blessed and run by a commercial entity. Some people were not happy, and the Suricata project emerged as an independent, Snort-rule-compatible engine.

Snort / Suricata engines have a rich set of community-supported and commercial rules available. They can run on an edge machine (router / firewall), monitor all network traffic and flag and/or block bad traffic from flowing through.

Snort / Suricata have some fantastic integration features with analytics and search/indexing tools. More details here.

Bro IDS
Bro is one of my favorite tools!

The “IDS” tag in the name (since fixed) is unfortunate, because Bro is a general-purpose programmable network monitoring platform that also does a fine job as an IDS. It can be programmed to take action and control edge devices for an IPS-type setup. The Bro engine is driven by program-like scripts that define the patterns to be matched, ignored or alerted on.

Bro is known to run on commodity hardware and scale up to 100Gbps. Here is a Berkeley paper on a 100Gbps IDS powered by Bro. I am not doing justice to Bro’s capabilities by writing a small paragraph here. This deserves its own article. Something for the future.
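Just to give a flavor of what can be done with Bro’s output, here is a small sketch that summarizes top talker pairs from a conn.log (assuming the default tab-separated field layout; column positions may differ with a custom configuration):

```python
# Summarize top talker pairs from a Bro conn.log (default tab-separated layout).
# Column positions (2 = id.orig_h, 4 = id.resp_h) assume the stock field order.
import csv
from collections import Counter

talkers = Counter()
with open("conn.log") as fh:
    rows = csv.reader((line for line in fh if not line.startswith("#")), delimiter="\t")
    for row in rows:
        talkers[(row[2], row[4])] += 1      # (originator host, responder host)

for (orig, resp), count in talkers.most_common(10):
    print(f"{orig} -> {resp}: {count} connections")
```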

Security Onion
Link: SecurityOnion
Bro and Snort are just the tip of the iceberg of network security monitoring tools. There is a whole slew of logging, parsing, indexing and search infrastructure tools that can be integrated with these engines to enhance their use cases.

Security Onion is a pre-packaged distribution that includes Bro and Snort plus a long list of other tools that work out of the box after installation. It also lets you distribute sensors at multiple points in a network and consolidate the collected data in a central location. A good starting point is to bring up Security Onion in a VM and feed it pre-captured traffic; both Bro and Snort can consume .pcap files. Absolutely fantastic work by the Security Onion team.

Security monitoring is very hard work but very exciting and rewarding. There is a huge trove of software available. There is no one right way to do it.

YMMV.

Software Defined Everything


The Information Technology world lives and breathes buzzwords. No doubt there is innovation happening, but the over-hype-ification by crafty marketing teams almost borders on snake oil. For example, here is what Google says about Cloud Computing –

Cloud Com-put-ing
The practice of using a network of remote servers hosted on the internet to store, manage and process data, rather than a local server or a personal computer.

By that definition, my Yahoo email account and my first website on Geocities (circa 199x) were made possible by cloud computing. But where were all the cloud certification courses, cloud strategy consultants, private cloud platforms, cloud this and cloud that? The term cloud computing only started buzz-ing in the past 10 years. Put some lipstick on a pig and tadaaa!! Hosted services become Cloud Computing!

Type “Software Defined” in a search engine and the first few pages are filled with results for “Software Defined Networks” and “Software Defined Storage”. Software defined concepts are not new.

Software Defined Software
When code written in C/C++ or any other high-level language is compiled, machine-executable instructions are generated.

That is “software defining software”.

High-level languages make software development easier and accessible to the general public, who otherwise would not fare well dealing with assembly language; the pace of progress would also be very slow. With libraries and modules being built to make life easier for future generations of programmers, these layers are, simply put, software defining software that defines software.

Software Defined Hardware
Emulation and Virtualization.

Hardware emulation translates the functions of one type of hardware onto other hardware platforms. The Unicorn CPU emulator is a good example. Emulation is very useful in prototyping and is also used to implement portable code. One class of tools popular with hardware product teams is the FPGA. These generic programmable logic boards have become a key part of hardware design processes and often end up being used in the final product. Here is a cool Nintendo emulator implementation on an FPGA.
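Unicorn ships with Python bindings, which makes the “software defined hardware” idea very tangible. A minimal sketch that emulates a single x86 instruction and reads back the register state:

```python
# Emulate a single x86 instruction (INC ecx) with the Unicorn CPU emulator.
from unicorn import Uc, UC_ARCH_X86, UC_MODE_32
from unicorn.x86_const import UC_X86_REG_ECX

CODE = b"\x41"               # INC ecx (32-bit mode)
BASE = 0x1000000             # arbitrary address to map the code at

mu = Uc(UC_ARCH_X86, UC_MODE_32)
mu.mem_map(BASE, 2 * 1024 * 1024)        # map 2 MB of emulated memory
mu.mem_write(BASE, CODE)
mu.reg_write(UC_X86_REG_ECX, 0x1233)
mu.emu_start(BASE, BASE + len(CODE))     # run the mapped code
print(hex(mu.reg_read(UC_X86_REG_ECX)))  # prints 0x1234
```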

Hardware virtualization has become a huge part of IT solutions. The primary selling points were consolidation and efficient resource usage, but virtualization has expanded into several other use cases as well. In the past 10 years, virtualization has completely transformed IT infrastructure architectures, both hardware and software. The next wave is containerization for next-generation “Cloud Computing” implementations. Everyone wants to be Google!

Software Defined Sound and Light
With entertainment systems going digital, content digitization added a whole new set of capabilities. Sound waves (voice / music) and light (photo / video) are sampled by hardware sensors thousands of times a second, and a binary representation is created. More samples per second produce higher-resolution data.

This digital representation is then used to recreate the original content on speakers and screens. The music and video data defines what the playback software does to the audio output and the pixels on the screen.
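A tiny sketch of the sampling idea, using only Python’s standard library (the sample rate, tone frequency and file name are arbitrary choices):

```python
# Generate one second of a 440 Hz tone: sample a sine wave 44,100 times
# per second and write the 16-bit samples to a WAV file.
import math
import struct
import wave

RATE = 44100                 # samples per second
FREQ = 440.0                 # tone frequency in Hz
samples = [
    int(32767 * math.sin(2 * math.pi * FREQ * n / RATE))
    for n in range(RATE)     # one second of audio
]

with wave.open("tone.wav", "w") as wav:
    wav.setnchannels(1)      # mono
    wav.setsampwidth(2)      # 16-bit samples
    wav.setframerate(RATE)
    wav.writeframes(b"".join(struct.pack("<h", s) for s in samples))
```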

That is software defined sound and light!

This completely revolutionized entertainment as we know it. Content can be copied, filtered and processed with special effects on the fly, all in software. New filters and processing techniques can be added as new algorithms are developed. This has massively improved the computer gaming user experience.

Remember .MOD and .STM files? MIDI? DOSBox (a DOS emulator) works very well. Inertia Player, STM player. Fun!

Software Defined Control Systems
Fly by wire?

  • Pilots move a physical cockpit control (yoke or joystick)
  • Hardware sensors read that movement and convert it to digital sensor data
  • Airplane control system software computes an appropriate control surface response
  • Software then controls hydraulic actuators to move airplane control surfaces (aileron, elevator, flaps)
  • Combine this with a flight management system that has GPS location data and airspace information, and the software can pretty much fly the airplane from takeoff to landing

By defining control surface movements in software, a computer can help pilots fly with more precision and automation. As more data from newer sensors is made available, software changes can add newer flight tools in the cockpit.
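To make that concrete, here is a toy sketch of a control law in the same spirit (all names, gains and limits are hypothetical, purely for illustration):

```python
# Toy fly-by-wire control law: map a digitized stick input to an aileron command.
# All names, gains and limits here are hypothetical, for illustration only.
MAX_DEFLECTION_DEG = 25.0    # assumed aileron travel limit
STICK_GAIN = 0.5             # assumed degrees of aileron per degree of stick
DAMPING_GAIN = 0.1           # resist fast rolling motion

def aileron_command(stick_deg: float, roll_rate_dps: float) -> float:
    """Blend pilot input with a simple roll-rate damper and clamp the output."""
    cmd = STICK_GAIN * stick_deg - DAMPING_GAIN * roll_rate_dps
    return max(-MAX_DEFLECTION_DEG, min(MAX_DEFLECTION_DEG, cmd))

print(aileron_command(stick_deg=10.0, roll_rate_dps=5.0))   # 4.5 degrees
```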

Software Defined Radio
I have always believed that radio geeks are geekier than IT geeks. SDRs allow IT and radio geekery to be combined into super-geekery. SDRs simplify radio hardware devices and allow them to be coupled with powerful and flexible software. This combination blows away the roadblocks in the way of radio technology innovation.

Amateur radio operators and other hobbyists now have a rich toolset to experiment with and build on top of the basic hardware / software combination. Take a look at this excellent introduction to SDRs by a very bright ~10-year-old –

Not everything is snake oil in the recent wave of software defined technology. It’s just that the marketing hype needs to be sifted through to understand where the real innovation is happening.

That is a topic for a future post.

Is the new iPad Pro worth your money?


Is the new iPad Pro worth your money? Maybe, maybe not.

I don’t know. But thanks for clicking through to read this post. 🙂 Who am I to tell you whether the new iPad Pro is worth your money or not? It is your money and it is for you to make that decision, maybe based on the classic “Need vs. Want” analysis. But why would you go through that? You know you don’t “need it” but you do “want it”. Right?

That point aside, just like Bollywood movies, there exist two dozen formulas for post titles that get recycled as “new and informative material”.

Like these –

  • 5 reasons why you should buy an iPad Pro
  • 5 reasons why you should not buy an iPad pro this holiday season
  • 10 reasons why Apple is doomed
  • 10 reasons why everyone is losing to Apple

Some of them do have interesting content, but the title and tone of these articles are so generalized that they just do not apply to everyone and do not make sense at all in the larger context.

I used to ignore them, but with such titles sprinkled all over the web, you do sometimes fall into the trap, end up reading through and then crawl out disappointed.

Try these Google searches and see for yourself. Some interesting patterns emerge –

And my recent favorite –
5 Reasons Why People Who Cry A Lot Are Mentally Strong

Come to think of it, I did read some of these formula articles and was mentally stronger every time. Thank you for reading through this one.

BTW, I do “want” and have 5 reasons for the iPad Pro this holiday season. 🙂

 

Exchange ActiveSync – iPhone – Why certificate lookups don’t work?


If you handle any kind of confidential material on your work email (most of us do), encrypted email is a must to ensure confidentiality and security of the material being moved around.

Warning: This is a long text post with lots of tech details. No pretty diagrams this time.

Other than proprietary solutions for intra-company encryption, there are only a few “open standards” that two random organizations can use to exchange encrypted emails. S/MIME is one of the popular ones. If your organization already uses appropriate certificates for user authentication (WiFi, 802.1x, disk encryption, etc.), enabling email encryption could be a simple matter of configuring the email client to use that cert for S/MIME email encryption.

Assuming all of that is set up, an email sender still needs the intended recipient’s “public key” before an encrypted email can be sent. Certificates can be exchanged manually using “signed emails” between two users, but that becomes a headache once exchanges are frequent.

That is where enterprise directories like “Active Directory” can help. Microsoft Exchange being one of the most popular email infrastructure choices, it is easy to publish all user certificates to the corporate directory. That way, any Exchange-compatible client can look up recipient certs while composing an email.
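Since the published certificates are just attributes on the user object, they can also be fetched directly over LDAP. A minimal sketch using the Python ldap3 library (the server name, base DN, service account and recipient address are placeholders):

```python
# Fetch a recipient's published S/MIME cert from Active Directory over LDAP.
# Server, base DN, service account and recipient address are placeholders.
from ldap3 import Server, Connection, ALL

server = Server("ldaps://dc01.example.com", get_info=ALL)
conn = Connection(server, user="EXAMPLE\\svc_lookup", password="secret", auto_bind=True)

conn.search(
    search_base="dc=example,dc=com",
    search_filter="(mail=recipient@example.com)",
    attributes=["userCertificate"],
)
for entry in conn.entries:
    for der_cert in entry.userCertificate.values:   # DER-encoded X.509 blobs
        print(f"{len(der_cert)} bytes of certificate data")
```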

Microsoft Exchange can be accessed using a variety of client protocols. Most of those protocols expose similar interfaces and provide almost the same functionality from an email client’s perspective; the difference is in the transport protocol structure in use. These client protocols also give the email client the ability to request a recipient’s email certificate on the fly.

So, publish each user’s cert with Active Directory / Exchange and the problem is solved, right? Not quite so.

For Microsoft’s own email clients – Outlook, etc. – that use EWS to talk to the Exchange CAS (Client Access Server), the cert lookup process works fine. Microsoft even has an IE ActiveX plugin to enable S/MIME in “Outlook Web Access”.

The problem currently exists with clients that use the Microsoft ActiveSync protocol to access Exchange services. If you configure Apple iPhone Mail or the Android Touchdown app for Exchange, they use ActiveSync instead of EWS. The ActiveSync protocol provides a command (ResolveRecipients) to fetch a recipient’s email certificate, but when the client makes that call, the CAS ActiveSync process is unable to fetch the cert and returns a negative response.

We first came across this issue several years ago and, after extensive online searches and forum posts, concluded (like others) that this was an Apple iPhone Mail issue and kept cursing Apple for it. We also kept testing it with every iOS release, with no success. The logic was that if the lookup worked for EWS clients, ActiveSync should produce the same result since it is the same CAS server.

But that was an incorrect assumption; the issues are specific to each client. After hours of debugging and educating (yes) MS support engineers on S/MIME, here is the current summary –

For ActiveSync Clients (Apple iPhone Mail / Android Touchdown) –
The issue is with Microsoft ActiveSync Server – documented here.
(Screenshots: the client cert request to the CAS, and the CAS ActiveSync response to the cert lookup request.)

For Apple Mail – OS X (uses EWS)
The problem is with Apple Mail trying to use the Keychain to look up certs instead of the EWS protocol. Keychain lookup does not work on OS X versions before El Capitan, and works on El Capitan only if the machine is joined to the Windows domain and directory lookup is enabled in Keychain. This is not a practical solution if the OS X client is roaming outside the corporate network with no LDAP access to the GAL / domain controllers to run a cert lookup query.

Outlook Mac 2016 (latest version)
S/MIME cert lookups stopped working in version 15.11. Anything between 15.3 and 15.10 still works. This is a known issue with the MS dev teams, with no specific fix date.

Update: To get this to work, add the intermediate CA + root CA cert chain to all EWS servers. Without that, if the EWS server is unable to validate the cert chain, it will silently ignore received certs.
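Before pushing the chain to the EWS servers, it can be worth sanity-checking that a user cert actually links up to the intermediate and root you are about to install. A hedged sketch with Python’s cryptography library (file names are placeholders; this only checks issuer / subject linkage, not full path validation):

```python
# Sanity check: does the user cert chain up to the intermediate and root CA?
# File names are placeholders; this is only an issuer/subject linkage check.
from cryptography import x509
from cryptography.hazmat.backends import default_backend

def load_cert(path):
    with open(path, "rb") as fh:
        return x509.load_pem_x509_certificate(fh.read(), default_backend())

leaf = load_cert("user.pem")
intermediate = load_cert("intermediate_ca.pem")
root = load_cert("root_ca.pem")

assert leaf.issuer == intermediate.subject, "user cert not issued by this intermediate"
assert intermediate.issuer == root.subject, "intermediate not issued by this root"
print("Chain links up for", leaf.subject.rfc4514_string())
```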
