VaibhaV Sharma

life @ vsharma . net

Seaplane Flying in Alaska


This is an edited video of my Seaplane Flying adventure in Alaska in August 2016 –

Enterprise Class Personal Storage


With a growing list of devices in our day-to-day lives, personal data storage needs have expanded beyond what a single machine can hold.

Functionally, this is similar to paper and non-digital copy days where one would end up dealing with the following –

  • Printed photographs
    • Stored in albums or just envelopes stacked somewhere
  • Original, copies and hand-written documents of various types
    • Photocopies of documents
    • Printed documents
    • Stored in filing cabinets, very hard to organize
  • Audio and Video
    • Vinyl records (gaining traction again)
    • Music audio tapes
    • Home video tapes
    • Entertainment / movies (VHS, CD, DVD and older)
    • Phone voicemail tapes
    • Home security camera footage
    • Stored physically next to the entertainment center in a shelf or storage totes


Here is my list of personal data storage needs –

  • Photographs and Videos
    • 10,000 photos and growing
    • 800GB+ of personal videos and growing
    • From cell phones
      • Uploaded to social media
      • Copied to local HDD in a date organized folder structure
      • Backed up with phone backups on the desktop
    • From DSLR
      • Copied to local HDD in a date organized folder structure
    • From GoPro and other cameras (after motorcycle or flying trip)
      • Copied to local HDD in a date organized folder structure
    • From other people’s devices
      • Received on email
      • Received through social media, manually saved somewhere
    • From security cameras
      • Stored in a DVR or shared storage device
    • Processed content generated with video / photo editing software
  • Documents
    • Financial – Bank statements, shopping receipts, etc.
    • DMV registration and payment receipts
    • Medical – Bills, insurance and payment receipts
    • Home mortgage, insurance and warranty documents
    • Personal documents – passport, travel history, old mark sheets and certificates
  • Books
    • Kindle / iBooks – cloud sources
    • PDF and other formats
  • Audio / Video / Entertainment
    • iTunes, Amazon MP3 and other sources
    • MP3 collection encoded from audio CDs
    • Cached copies of iTunes and other online media
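
The “date organized folder structure” used above for phone, DSLR and GoPro imports can be sketched in a few lines of Python. This is only an illustration: the `YYYY/MM/DD` layout, the use of file modification time instead of EXIF capture time, and the function names are my assumptions, not a description of any particular tool.

```python
import shutil
from datetime import datetime
from pathlib import Path


def dated_destination(src: Path, library_root: Path) -> Path:
    """Return library_root/YYYY/MM/DD/<filename> based on the file's mtime."""
    taken = datetime.fromtimestamp(src.stat().st_mtime)
    return library_root / f"{taken:%Y}" / f"{taken:%m}" / f"{taken:%d}" / src.name


def import_media(source_dir: Path, library_root: Path) -> list[Path]:
    """Copy every file under source_dir into the dated layout; skip existing files."""
    copied = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        dest = dated_destination(src, library_root)
        if dest.exists():
            continue  # never overwrite something already in the library
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 preserves timestamps
        copied.append(dest)
    return copied
```

Running `import_media(Path("/path/to/card"), Path("/path/to/library"))` a second time copies nothing, which makes repeated imports from the same card safe. Both paths here are placeholders.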


ALL of the above are now available for download / local storage or accessible on demand from the cloud. One could argue that everything can be uploaded to one cloud service or another but that quickly gets very expensive and those cloud services often don’t survive through mergers and acquisitions with competition.


Of course enterprise storage needs extend beyond the basics but if you consider the “functional requirements” for an average personal data storage solution, they are not very different from basic enterprise storage needs –

  • Must maintain data consistency
    • Should not corrupt data without external influence
    • Must handle scenarios including corruption initiated between the keyboard and the chair
  • Must be redundant
    • Failure of a single (or even double) storage component should not cause complete loss of data
  • Accessibility
    • Based on personal preference, data must be available on multiple devices
  • Availability
    • Must be up and running when needed, in some cases for days (weekends?)
  • Must support Disaster Recovery
    • Home fire scenario
    • One copy stored off-site
  • Speed
    • Must be fast enough to handle regular data backups without hours of transfer time
    • Must be fast enough to handle photo and video processing workloads without choking the editing workflow
  • Upgradable / Expandable
    • Must support swapping out disks to add more space in the future
  • Cost
    • Must not cost a fortune
  • Support
    • For commercial products, some level of support would be good
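
The data consistency requirement at the top of this list can be checked mechanically. As a minimal sketch (the manifest format and function names are hypothetical), record a SHA-256 digest per file and re-verify it later to catch silent corruption or an accidental edit:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos do not need to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return files that are missing or whose content no longer matches."""
    bad = []
    for rel, expected in manifest.items():
        path = root / rel
        if not path.is_file() or sha256_of(path) != expected:
            bad.append(rel)
    return bad
```

A checksum only detects a problem; recovering from it still needs the redundant or off-site copy listed above.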


Here are some of the storage options that were traditionally considered only in enterprise settings –

Larger Internal Storage

  • Data Consistency – Good
  • Redundancy – Possible but most consumer desk/laptops do not include RAID
  • Accessibility – Limited to local machine, except network sharing which is not very convenient
  • Availability – Good
  • Disaster Recovery – Needs extra work
  • Speed
    • Good – Typically 5400 RPM / 3Gbps SATA (slow)
    • Better – 7200 RPM / 6Gbps SATA. 12Gbps options not yet that common
    • Best – Larger SSDs – Still very expensive
  • Expandability – Possible but painful to reinstall / copy everything
  • Cost – Varies with needs
  • Support – Bundled with unit
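
To put the interface numbers above in perspective, here is a rough back-of-the-envelope calculation. It assumes the nominal line rate is fully usable (real-world throughput is lower because of protocol overhead and the disks themselves) and uses the 800GB video collection mentioned earlier as the payload. The 1Gbps Ethernet row is my addition for comparison and the function name is hypothetical.

```python
def transfer_hours(size_gb: float, link_gbps: float) -> float:
    """Best-case hours to move size_gb gigabytes over a link_gbps link.

    Assumes decimal units (1 GB = 8 gigabits) and zero overhead.
    """
    seconds = (size_gb * 8) / link_gbps
    return seconds / 3600


for name, gbps in [("1Gbps Ethernet", 1), ("3Gbps SATA", 3), ("6Gbps SATA", 6)]:
    print(f"{name}: {transfer_hours(800, gbps):.2f} h")
```

These are upper bounds on speed only; a 5400 RPM disk will not come close to saturating a 3Gbps link.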

DAS – External Direct Attached Storage

  • Data Consistency – Can vary with unit
  • Redundancy – Possible with Hardware or Software RAID
    • Hardware RAID is more expensive
    • Software RAID is less reliable
  • Accessibility – Limited to local machine, except network sharing which is not very convenient
  • Availability – Good
  • Disaster Recovery – Needs extra work
  • Speed – Depends on interface and disk speed
  • Expandability – Possible but most RAID units force you to re-create / re-copy everything
  • Cost – Varies
  • Support – Bundled with unit
  • Examples – DROBO, OWC ThunderBay, etc.
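
The redundancy that parity-based RAID provides comes down to XOR. The sketch below is a toy illustration of the idea, not how any of the products above implement it: three equally sized data blocks, one parity block, and reconstruction of a lost block. All names and sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equally sized blocks together byte by byte."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"photos__", b"videos__", b"docs____"]  # one block per data disk
parity = xor_blocks(data)                        # stored on a fourth disk

# Simulate losing disk 1, then rebuild it from the survivors plus parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
```

A single parity block survives one failed disk; surviving a double failure needs a second, differently computed parity, which this sketch does not cover.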

NAS – Network Attached Storage

  • Data Consistency – Can vary with unit
  • Redundancy – Possible with Hardware or Software RAID
    • Hardware RAID is more expensive
    • Software RAID is less reliable
  • Accessibility – Very good
  • Availability – Good
  • Disaster Recovery – Needs extra work
  • Speed – Slow – Network access adds latency and lowers transfer speeds
  • Expandability – Possible but most RAID units force you to re-create / re-copy everything
  • Cost – Expensive
  • Support – Bundled with unit
  • Examples – Lot of them – DROBO, Synology, QNAP, Netgear ReadyNAS, Buffalo Linkstation, etc.

Cloud Storage

  • Data Consistency – Good
  • Redundancy – Good
  • Accessibility – Good
  • Availability – Good
  • Disaster Recovery – Good
  • Speed – Very Slow – Network access adds latency and lowers transfer speeds
  • Expandability – Very easy, not a factor
  • Cost
    • Expensive for online mount-able access like Dropbox or
    • Cheap if only used for off-site data dumps –
  • Support – Varies by provider
  • Examples – Backblaze, Dropbox, various cloud drives – iCloud, Google Drive, etc.


Reading through the options above, there is no single solution to the list of “functional needs”. To cover everything, a combination of the following is typically what works –

  • Medium sized local SSD storage
    • Provides speed and availability
  • (Optional) Large direct attached storage
    • Provides speed, expandability, availability
  • Large Network attached storage
    • Provides expandability, availability and accessibility
  • Data replication to the Cloud for off-site storage
    • Provides Disaster recovery, availability, accessibility

Data transfer speeds for local, DAS and NAS vary wildly depending on the components used, especially the choice of hard drives, SSDs and controllers.

But that is a topic for another article.

Dealing with Legacy Infrastructure


Dealing with legacy infrastructure is like dealing with the sunset scenario of an old car. If timed right, maximum ROI can be extracted out of the old car before investing in the latest and greatest technology of a new car. Migrating infrastructure up to the cloud is one transition option but there still are significant use cases for physical on-premises or self-hosted infrastructure.

Any organization that has grown “organically” and/or has been in business for more than a few years, ultimately needs to deal with legacy infrastructure. As organizations get older, this cycle repeats and hopefully gets better every few years.

Legacy infrastructure and legacy applications are related but two different problems. In this context, infrastructure is Servers, Switches, Firewalls, Routers, Storage, Racks, Cabling, Power and PDUs.

Legacy infrastructure is –

  • EOS (End Of Sale) / EOL (End of Life) / EOSup (End of Support)
    • The manufacturer will not sell, repair, upgrade or support
  • About to go EOS / EOL / EOSup
    • Manufacturer will support it but requires signing your next child’s life earnings to them
  • Was being maintained but that team member left / was let go
  • Bought this expensive unit for project X but that never took off

Why Upgrade?
Ideally, legacy infra management starts at the time of purchase. Proper architecture, procurement and implementation strategy can significantly aid with maximum ROI over the years.

Mature organizations (large and small) tend to have a policy driven cycle of regular upgrades fueled by engineering decisions, asset depreciation or even compliance certification. For SLA driven hosted service providers, planning for upgrades is (or should be) a critical part of infrastructure strategy.

Full Stack Migrations
Top to bottom, a typical infrastructure stack looks like this –

– (Out of Scope) OS / Hypervisor / Application
– Servers
– Storage
– Layer 3+ Devices – Firewalls / Routers
– Layer 2 – Switches
– Cabling
– Racks
– PDUs
– Power Circuits

It is a lot of work but a lot cleaner to do full stack migrations. Full stack migrations keep project managers employed. Design and build a new stack (Power Circuits to OS) and do a massive coordinated application migration to the new stack. In most situations, this is not an option.

Partial upgrades at individual layers are messy but more common. This keeps systems admins busy (and employed).

Power Circuits and PDUs
As applications scale, growing resource demands drive the need for higher capacity or higher density equipment. If the building has more power available, run new circuits with different / higher power capacity (220V instead of 120V, or 3 phase) and then swap out old PDUs. One typical roadblock with power is cooling capacity. Some cooling efficiency can be gained by adding cold aisle containment.
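
The gain from a higher-voltage circuit is simple arithmetic: power is voltage times current. The sketch below assumes single-phase circuits and a 30A breaker loaded to 80% for continuous use. Both figures are illustrative assumptions, not numbers from any specific facility, so check the local electrical code and the actual circuit rating.

```python
def usable_watts(volts: float, breaker_amps: float, derating: float = 0.8) -> float:
    """Usable continuous power for a single-phase circuit (P = V * I * derating)."""
    return volts * breaker_amps * derating


low = usable_watts(120, 30)
high = usable_watts(220, 30)
print(f"120V/30A: {low:.0f} W, 220V/30A: {high:.0f} W, gain: {high / low:.2f}x")
```

Every extra watt delivered to a rack ends up as heat, which is why cooling capacity tends to be the next limit.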

Power circuits + PDUs can be swapped out without downtime if old PDUs and power cabling have been installed properly. If old cabling is a mess, scheduled downtime might be a better option than blowing fuses and power supplies when cables come loose.

Most PDUs have a long life and can be reused. Except for some management features, flexible designs and increased densities, PDUs have not seen much innovation over the years. They don’t fetch much in the secondhand market either. So you might as well keep and reuse them.

Racks
Most older racks can be reused as-is unless new designs demand better cabling, physical security and cooling. Space permitting, rear extensions can be added to some racks to accommodate additional accessories, better cable management or bigger PDUs. When picking racks, deeper, wider, taller, the better.

Cabling
Old network and power cables can be reused unless connectivity designs have changed. A different cable type (Cat7A vs. Cat6, PDU style vs. 3-prong power) typically does not make much difference for average applications unless the equipment is really pushing close to data and power limits. Cabling upgrades, if done right, are typically one of the most expensive processes in terms of labor cost and time. But cabling done right can often survive the next several iterations of legacy infrastructure refresh. Lab environments are different: there, ripping out and replacing (or re-running) old cables is the best policy.

Cabling designs deserve their own post. Something for the future.

Network Switches
Upgrading a 10 year old 1Gb copper network switch with a latest generation 1Gb copper network switch might not be very beneficial except if the backbone network has moved to different, higher speed interfaces. Network switch upgrades can be very disruptive if critical servers do not have multiple links into the access / storage layers. Core switch upgrades are even worse if cabling and other issues are not considered when provisioning them.

Network switches do have some resale value in the secondhand market. Older switches are also a very good option for lab environments, test stacks or even training / R&D. Swapping network vendors between upgrades is sometimes a religious and team morale issue but is possible. I know because I have been there, done that, survived and apparently, people still like me.

Layer 3+ Devices – Firewalls / Routers
In a stable environment with most of the flux at the application layer, edge routers often do not see much change over the years. Except for occasional software upgrades and configuration updates, edge routers sit rock solid. Edge / core router upgrades are easier if appropriate routing and redundancy (HSRP?) protocols are implemented. Typical upgrade process involves replacing the standby unit, switching traffic to the new unit and then doing the same with the active unit.
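
The standby-first sequence described above can be written down as an ordered runbook. This is a toy model with hypothetical names. It only tracks which unit carries traffic and which software version each unit runs, and does not talk to any real router or redundancy protocol.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    name: str
    version: str
    active: bool


def rolling_upgrade(pair: list[Unit], new_version: str) -> list[str]:
    """Upgrade an active/standby pair so one unit always carries traffic."""
    log = []
    for _ in range(2):
        standby = next(u for u in pair if not u.active)
        active = next(u for u in pair if u.active)
        if standby.version == new_version:
            break  # both units already upgraded
        standby.version = new_version
        log.append(f"upgrade {standby.name} (standby)")
        # Fail over so the freshly upgraded unit takes traffic.
        standby.active, active.active = True, False
        log.append(f"failover: {standby.name} now active")
    return log
```

In this model the second pass fails traffic back to the original active unit once it has been upgraded; whether to fail back or leave the roles swapped is a judgment call for the real environment.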

Firewalls are a bit more involved in application stacks and require constant security upgrades, configuration changes, etc. As traffic grows, firewalls run out of capacity and need to be replaced. Most decent firewalls do have redundancy options. If connection state replication is an option, pretty much no disruption is expected but most firewall changes are very disruptive.

Routers and firewalls, with advance planning and proper implementation, can prevent or at least reduce downtime significantly.

Storage
DAS, NAS, SAN – out of the three flavors, DAS is obviously the most disruptive as physical changes are needed.

NAS and SAN upgrades are easier if the underlying protocol remains the same. Without depending on the storage vendor, NAS and SAN upgrades can be painless (not painfree) if the application supports online data migration.

For example, VMware can do “Storage vMotion” across two storage targets. An entire virtual machine can be moved to newer storage without any downtime. Oracle RAC, on the other hand, ties in tightly with the storage backend with shared volumes for data, voting disks and other cluster uses. Downtime is almost guaranteed in most cases unless storage layer clustering magic saves the day.

Most of the older storage units do not have clustering or non-disruptive data replication and transition options. Newer units have the option to add new cluster partners and migrate services over transparently. All of that sounds very nice but this is not possible if the old unit does not have clustering configured right.

Once decommissioned, old storage can be re-provisioned as backup storage for non-critical environments or used for internal training / R&D.

Servers
If an application is running well with no server capacity issues, its servers are best kept as-is. If it ain’t broken, don’t fix it. If the hardware is out of warranty, keep a few spare units to cannibalize for parts. Used units of old models often flood the secondhand market in batches and make for very good spare part bins. If the team has some skill and time available, old servers make for excellent clusters. A nice Hadoop cluster with a large NFS storage pool for backups, or an even simpler storage cluster using Gluster (for example), is easily possible.

Older servers with more RAM and faster HDDs fetch more in the secondhand market.

What to do with an old pile of hardware?
Here is where company policy and culture can make a big difference. To extract maximum ROI out of old hardware, some level of inventory and resource management skills are needed. Some creative re-assignment of old hardware can provide significant boost to newer POC initiatives with temporary and ever changing resource needs.

There are several other ways to deal with old hardware –

  • Resell / liquidate
  • Donate it (Tax Benefit?)
  • Re-use in test / non-critical environments
  • Use as a break / fix / ops training tool
  • Give it away as an employee perk. A lot of people run small datacenters in their garage at home.

Infrastructure management is a lot of hard work but provides necessary resources to make an organization successful. If not managed well, infrastructure tends to be a large sinkhole for hard cash.

Do not blindly design and run your infrastructure with the aspirations of copying Google or Yahoo!. Every organization is different.


Passive Network Security Monitoring


Most people visualize “IT security” as – sophisticated, protected by body builders with dark glasses, men-in-black type images and Firewalls !! Focusing just on network security, one way to slice it would be – active and passive network security.

Network security is a constant battle of keeping up with new software / system exploit techniques. Network and application traffic needs to be constantly monitored to identify new exploit patterns. Passive monitoring tools can record, analyze, correlate and produce highly valuable security intel specific to a network.

You don’t need to shell out a pentabillion $$ for turnkey commercial solutions. The Free / Open Source community has a lot of it covered. It does help if you know what you are doing.

Active (in-line) monitoring typically includes “bump in the wire” type solutions –

  • Firewalls (yeeaah!)
  • Malware scanners (Spam, Phishing, Virus)
  • Whitelisting / blacklisting at various layers
  • Encryption

Active measures are good first steps but they are only as effective as the signature data and configuration driving them. Every organization’s traffic profile is different and a lot of times boilerplate active measures are not very effective or go stale very quickly.

Most firewalls are configured to block or allow combinations of IP / port / protocol. Some with more resources and features can do DPI (Deep Packet Inspection) to catch malware or intrusion attempts and also function as IPS (Intrusion Prevention). Malware scanners depend on pre-configured patterns of known bad attachments or phishing URLs. Whitelisting / blacklisting rules need to be updated on a regular basis to be effective.
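
The “block or allow combinations of IP / port / protocol” model can be sketched as a first-match rule table. The rule fields, the default-deny policy and the sample addresses (taken from documentation-reserved ranges) are assumptions for illustration; real firewalls add state tracking, zones and much more.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str                 # "allow" or "block"
    source: str                 # CIDR network, e.g. "0.0.0.0/0" for any
    protocol: str               # "tcp", "udp" or "any"
    port: Optional[int] = None  # None matches any destination port


def decide(rules: list[Rule], src_ip: str, protocol: str, dst_port: int) -> str:
    """Return the action of the first matching rule; default is block."""
    addr = ipaddress.ip_address(src_ip)
    for rule in rules:
        if addr not in ipaddress.ip_network(rule.source):
            continue
        if rule.protocol not in ("any", protocol):
            continue
        if rule.port is not None and rule.port != dst_port:
            continue
        return rule.action
    return "block"


RULES = [
    Rule("block", "203.0.113.0/24", "any"),       # known-bad range first
    Rule("allow", "0.0.0.0/0", "tcp", 443),       # HTTPS from anywhere
    Rule("allow", "198.51.100.0/24", "tcp", 22),  # SSH from one trusted range
]
```

Rule order matters in a first-match table: the block rule sits above the HTTPS rule so the bad range cannot reach port 443. A table like this also goes stale exactly as described above, since nothing in it learns about new bad ranges on its own.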

A passive monitoring system can be configured to parse a copy of live network traffic, flag known anomalies and take action or log it for a human to look at. Someone then does all the hard work of identifying new patterns and publishing them for general consumption. Thanks to Free / Open Source projects, a lot of this work is available in the open.

A good passive monitoring engine –

  • Can consume and keep up with monitored traffic
  • Can parse and de-construct connection flows on the fly
  • Can log any / all flow metadata (as configured) for correlation
  • Can apply pre-defined identification rules and flag suspicious activities
  • Has flexible configuration to define new patterns on the fly
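
As a rough illustration of the checklist above (and not how Snort, Suricata or Bro work internally), the sketch below consumes flow records, logs their metadata, and applies a small set of pluggable rules. The `Flow` fields, rule names and thresholds are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Flow:
    src: str
    dst: str
    dst_port: int
    bytes_sent: int


# A rule looks at one flow plus shared state and returns an alert or None.
RuleFn = Callable[[Flow, dict], Optional[str]]


def port_scan(flow: Flow, state: dict) -> Optional[str]:
    """Flag a source that touches many distinct ports on one destination."""
    ports = state["ports"][(flow.src, flow.dst)]
    if flow.dst_port in ports:
        return None
    ports.add(flow.dst_port)
    if len(ports) == 10:  # alert once, when the threshold is first reached
        return f"possible port scan: {flow.src} -> {flow.dst}"
    return None


def large_upload(flow: Flow, state: dict) -> Optional[str]:
    """Flag a single flow that sends an unusually large amount of data."""
    if flow.bytes_sent > 500_000_000:
        return f"large upload: {flow.src} -> {flow.dst} ({flow.bytes_sent} bytes)"
    return None


def monitor(flows, rules: list[RuleFn]):
    """Return (metadata log, alerts) for an iterable of flows."""
    state = {"ports": defaultdict(set)}
    log, alerts = [], []
    for flow in flows:
        log.append(flow)  # keep metadata for later correlation
        for rule in rules:
            alert = rule(flow, state)
            if alert:
                alerts.append(alert)
    return log, alerts
```

New patterns are added by writing another small function and appending it to the rule list, which is the “flexible configuration” point in miniature.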

There are several mature Free / Open Source projects that can help.

Snort / Suricata
Link: Suricata
Link: Snort

Snort used to be the de facto IDS / IPS engine of choice for anyone looking to run an IDS. Somewhere along the way, like any other wildly popular Open Source project, it was blessed and run by a commercial entity. Some people were not happy, and the Suricata project emerged as an independently developed, community-run alternative that can use Snort-style rules.

Snort / Suricata engines have a rich set of community supported and commercial rules available. They can run on an edge machine (router / firewall), monitor all network traffic and then flag and/or block bad traffic from flowing through.

Snort / Suricata have some fantastic integration features with analytics and search/indexing tools. More details here.

Bro is one of my favorite tools!

The “IDS” tag in the name (since fixed) is unfortunate because it is a general purpose programmable network monitoring platform that also does a fine job as an IDS. It can also be programmed to take action to control edge devices for an IPS type setup. The Bro engine is driven by program-like scripts that define patterns to be matched, ignored or alerted on.

Bro is known to run on commodity hardware and scale up to 100Gbps. Here is a Berkeley paper on 100Gbps IDS, powered by Bro. I am not doing justice to Bro’s capabilities by writing a small paragraph here. This deserves its own article. Something for the future.

Security Onion
Link: SecurityOnion
Bro and Snort are just the tip of the mountain of network security monitoring tools. There is a whole slew of logging, parsing, indexing and search infrastructure tools that can be integrated with these engines to enhance their use cases.

Security Onion is a pre-packaged distribution that includes Bro and Snort + a long list of other tools that work out of the box after installation. It also lets you distribute sensors at multiple points in a network and consolidate the collected data into a central location. A good starting point would be to bring up Security Onion in a VM and feed it pre-captured traffic. Both Bro and Snort can consume .pcap files. Absolutely fantastic work by the Security Onion team.
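
For a feel of what pre-captured traffic looks like on disk, the classic .pcap container is simple enough to read with the standard library. This sketch assumes the original little-endian, microsecond-resolution libpcap format (magic number 0xa1b2c3d4) and does not handle pcapng or byte-swapped files. The function names are hypothetical, and the writer exists only so the reader can be tried without a real capture.

```python
import io
import struct

GLOBAL_HEADER = struct.Struct("<IHHiIII")  # magic, version major/minor, tz, sigfigs, snaplen, linktype
RECORD_HEADER = struct.Struct("<IIII")     # ts_sec, ts_usec, captured length, original length
PCAP_MAGIC = 0xA1B2C3D4


def read_pcap(stream):
    """Yield (timestamp_seconds, raw_packet_bytes) from a classic pcap stream."""
    header = stream.read(GLOBAL_HEADER.size)
    if len(header) < GLOBAL_HEADER.size:
        raise ValueError("file too short to be a pcap")
    magic, *_ = GLOBAL_HEADER.unpack(header)
    if magic != PCAP_MAGIC:
        raise ValueError("not a little-endian microsecond pcap file")
    while True:
        record = stream.read(RECORD_HEADER.size)
        if len(record) < RECORD_HEADER.size:
            return  # end of file
        ts_sec, ts_usec, incl_len, _orig_len = RECORD_HEADER.unpack(record)
        yield ts_sec + ts_usec / 1_000_000, stream.read(incl_len)


def write_pcap(packets, linktype=1):
    """Build an in-memory pcap (linktype 1 = Ethernet) for experiments."""
    out = io.BytesIO()
    out.write(GLOBAL_HEADER.pack(PCAP_MAGIC, 2, 4, 0, 0, 65535, linktype))
    for ts, data in packets:
        sec = int(ts)
        usec = int(round((ts - sec) * 1_000_000))
        out.write(RECORD_HEADER.pack(sec, usec, len(data), len(data)))
        out.write(data)
    out.seek(0)
    return out
```

With a real capture, the same reader works on `open("capture.pcap", "rb")`, where the file name is a placeholder. Decoding the packet bytes themselves is the part the tools above do far better.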

Security monitoring is very hard work but very exciting and rewarding. There is a huge trove of software available. There is no one right way to do it.


Software Defined Everything


The Information Technology world lives and breathes buzzwords. No doubt there is innovation happening but the over hype-fication by crafty marketing teams almost borders on the snake-oil experience. For example, here is what Google says about Cloud Computing –

Cloud Com-put-ing
The practice of using a network of remote servers hosted on the internet to store, manage and process data, rather than a local server or a personal computer.

By that definition, my Yahoo email account and first website on Geocities (circa 199x) were made possible by cloud computing. But where were all the cloud certification courses, cloud strategy consultants, private cloud platforms, cloud this and cloud that? The term cloud computing started buzz-ing only in the past 10 years. Lipstick on a pig and tadaaa!! hosted services Cloud Computing!

Type “Software Defined” in a search engine and the first few pages are filled with results for “Software Defined Networks” and “Software Defined Storage”. Software defined concepts are not new.

Software Defined Software
When code written in C/C++ or any other high level language is compiled, it results in machine executable instructions being generated.

That is “software defining software”.

High level languages make software development easier and accessible to the general public, who otherwise would not fare well dealing with assembly language. Also, the pace of progress would be very slow. With libraries and modules being built to make life easier for future generations of programmers, these layers are, simply put, software defining software that defines software.

Software Defined Hardware
Emulation and Virtualization.

Hardware Emulation reproduces the functions of one type of hardware on other hardware platforms. The Unicorn CPU emulator is a good example. Emulation is very useful in prototyping and is also used to implement portable code. One of the tools popular with hardware product teams is the FPGA. These generic programmable logic boards have become a key part of hardware design processes and they also end up being used for the final product. Here is a cool Nintendo emulator implementation on FPGA.
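
At its core, a software emulator is a loop that fetches, decodes and executes another machine’s instructions. The toy below emulates a made-up accumulator machine with five instructions; the instruction set is entirely hypothetical and far simpler than anything Unicorn or an FPGA console core handles.

```python
def run(program: list[tuple], max_steps: int = 1000) -> int:
    """Emulate a hypothetical accumulator CPU and return the accumulator.

    Instructions: ("SETC", n) set counter, ("ADD", n) add to accumulator,
    ("DEC",) decrement counter, ("JNZ", target) jump if counter is not zero,
    ("HALT",) stop.
    """
    acc, counter, pc = 0, 0, 0
    for _ in range(max_steps):
        op, *args = program[pc]  # fetch and decode
        pc += 1
        if op == "SETC":
            counter = args[0]
        elif op == "ADD":
            acc += args[0]
        elif op == "DEC":
            counter -= 1
        elif op == "JNZ":
            if counter != 0:
                pc = args[0]
        elif op == "HALT":
            return acc
        else:
            raise ValueError(f"unknown instruction: {op}")
    raise RuntimeError("step limit reached without HALT")


# Multiply 7 by 3 using repeated addition.
program = [
    ("SETC", 3),
    ("ADD", 7),
    ("DEC",),
    ("JNZ", 1),
    ("HALT",),
]
result = run(program)
```

The “hardware” here is nothing but three Python variables, which is the whole point: the behavior of the machine is defined in software.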

Hardware Virtualization has become a huge part of IT solutions. Primary selling points were consolidation and efficient resource usage. But virtualization has expanded into several other use cases as well. In the past 10 years, virtualization has completely transformed IT infrastructure architectures, both hardware and software. The next wave is containerization for next generation “Cloud Computing” implementations. Everyone wants to be Google!

Software Defined Sound and Light
With entertainment systems going digital, content digitization added a whole new set of capabilities. Sound waves (voice/music) and light (photo/video) are sampled by hardware sensors thousands of times a second and a binary representation is created. More frequent samples-per-second create higher resolution data.

This digital representation is then used to recreate the original content on a screen or through speakers. Music and video data defines what the playback software does to the speakers and to the pixels on the screen.
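
The sampling step described above can be shown with a pure tone. The sketch assumes a 440Hz sine wave, signed 16-bit samples and two example sample rates; these values are illustrative choices of mine, not taken from any particular device.

```python
import math


def sample_tone(freq_hz: float, sample_rate: int, duration_s: float) -> list[int]:
    """Sample a sine wave and quantize each sample to a signed 16-bit integer."""
    count = round(sample_rate * duration_s)
    return [
        round(32767 * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(count)
    ]


low_res = sample_tone(440, 8_000, 0.01)    # 80 samples for 10 ms of audio
high_res = sample_tone(440, 44_100, 0.01)  # 441 samples for the same 10 ms
```

Both lists describe the same 10 ms of sound; the second simply has more points along the same curve, which is the higher resolution the text refers to.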

That is? Software defined sound and light!

This completely revolutionized entertainment as we know it. Content can be copied, filtered and processed with special effects on the fly, all in software. New filters and processing techniques can be added as new algorithms are developed. This massively improved computer gaming user experience.

Remember .MOD and .STM files? MIDI? DosBox (Dos Emulator) works very well. Inertia Player, STM player. Fun!

Software Defined Control Systems
Fly by wire?

  • Pilots move a physical cockpit control (yoke or joystick)
  • Hardware sensors read that movement and convert it to digital sensor data
  • Airplane control system software computes an appropriate control surface response
  • Software then controls hydraulic actuators to move airplane control surfaces (aileron, elevator, flaps)
  • Combine this with a flight management system that has GPS location data, airspace information and software can pretty much fly the airplane from takeoff to landing
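
The chain above (stick input, control law, actuator command) can be caricatured in a few lines. This is a deliberately naive sketch with made-up travel and rate limits; real flight control laws are far more involved and are held to a very different standard.

```python
from dataclasses import dataclass


@dataclass
class ControlLaw:
    max_deflection_deg: float = 25.0  # hypothetical elevator travel limit
    max_rate_deg_per_s: float = 40.0  # hypothetical actuator rate limit

    def command(self, stick: float, current_deg: float, dt: float) -> float:
        """Map stick position (-1..1) to a rate-limited surface deflection."""
        stick = max(-1.0, min(1.0, stick))        # sanitize sensor input
        target = stick * self.max_deflection_deg  # desired deflection
        max_step = self.max_rate_deg_per_s * dt   # allowed movement this tick
        step = max(-max_step, min(max_step, target - current_deg))
        return current_deg + step


law = ControlLaw()
elevator = 0.0
for _ in range(50):  # 50 ticks of 20 ms = 1 second of full back stick
    elevator = law.command(stick=1.0, current_deg=elevator, dt=0.02)
```

Because the limits live in software, changing how the airplane responds is a matter of changing numbers and logic rather than re-rigging cables and pulleys.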

By defining control surface movements in software, a computer can help pilots fly with more precision and automation. As more data from newer sensors is made available, software changes can add newer flight tools in the cockpit.

Software Defined Radio
I have always believed that radio geeks are geekier than IT geeks. SDRs allow IT and Radio geekery to be combined into super geeks. SDRs simplify radio hardware devices and allow them to be coupled with powerful and flexible software. This combination completely blows away roadblocks in the way of radio technology innovation.

Amateur radio and other hobbyists now have a rich toolset to experiment with and build on top of the basic hardware / software combination. Take a look at this excellent introduction to SDRs by a very bright ~10 yrs old –

Not everything is snake-oil in the recent wave of software defined technology updates. It’s just that the marketing hype needs to be sifted through to really understand where real innovation is happening.

That is a topic for a future post.


© 2021 VaibhaV Sharma
