Kudos for Mailgun !!!

I was able to give Mailgun a try recently and all I can say is Kudos !!!

Finally someone who has made Email Servers easy to deal with !!!

I will go so far as to say, Mailgun has made dealing with Email easier than Google !!!

But then we have to remember that Google does not offer the same level of service as Mailgun.

I was able to get set up with Mailgun in 1-2 hours, with very nice integration with Rackspace Cloud DNS.
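For the curious, sending mail through Mailgun takes only a few lines of Python against their documented HTTP messages endpoint; the domain and API key below are placeholders, and this is a minimal sketch rather than my exact setup.

    # A minimal sketch of sending mail via Mailgun's HTTP API with the
    # `requests` library. The domain and API key are placeholders.
    import requests

    MAILGUN_DOMAIN = "example.com"    # placeholder: your verified domain
    MAILGUN_API_KEY = "key-xxxxxxxx"  # placeholder: your Mailgun API key

    resp = requests.post(
        "https://api.mailgun.net/v3/%s/messages" % MAILGUN_DOMAIN,
        auth=("api", MAILGUN_API_KEY),
        data={
            "from": "Me <me@example.com>",
            "to": "you@example.com",
            "subject": "Hello from Mailgun",
            "text": "Sent with a few lines of Python.",
        },
    )
    print(resp.status_code, resp.text)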

I was previously using ZoneEdit for my DNS needs in this regard; however, I think it’s time to go with Rackspace.


Why doesn’t “crappy” code perform as badly as it seems it should ?

I like good code as much as the next person; however, I am wondering why people spend so much time focusing on “crappy” code patterns that don’t actually perform any worse in practice ?!?

Here’s a thought:  If you want to complain about “crappy” code, then why not fix all the crappy code you become aware of, including all that stuff you cannot seem to understand !!!

Or maybe just realize that your idea of “crappy” code may be another man’s treasure !!!  Oh wait, someone already thought of this when the phrase “…one man’s trash is another man’s treasure…” was coined, circa the 1570s !!!

Or to put this another way…  those who declare some coding patterns “crappy” might just be limiting their own ability to perceive really useful code they would otherwise overlook !!!

I am NOT in favor of truly crappy code !!!

But, I am also NOT in favor of being so close-minded that I cannot see how useful alternative coding styles may be.

Single instances of “crappy” code cannot be crappy !!!

If you have to run 1,000s or 1,000,000s of iterations to see a performance problem from a single-use chunk of code, then there is very little reason to worry about any such performance problem !!!

I might be impressed by those who rail against “crappy” code if they also made darned sure all the code in their own field of vision was not “crappy” !!!

Scenario #1

For instance, consider the following pattern that was recently classified as “crappy” by an intrepid band of hearty Python coders at a company we dare not name here…

This is “crappy” !

toks = 'one two three four five six seven eight nine ten'.split()

This is not !

toks = ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten']

The crappy version was labeled “crappy” because the split was seen as unnecessary: the non-crappy version produces the desired list directly.

Those who said the crappy version was crappy have probably never had to load a list from a comma-delimited jungle of data they spent several days entering by hand. If they had, they might have gained some useful experience as to why the crappy version is somewhat less crappy after all.

Scenario #1 Benchmarks

I like benchmarks.  I like to know when code may perform badly at run-time at scale.

The crappy version for Scenario #1 runs 5x slower than the non-crappy version for 1,000 iterations.

The crappy version for Scenario #1 runs 7.6x slower than the non-crappy version for 10,000 iterations.

The crappy version for Scenario #1 runs 7.65x slower than the non-crappy version for 100,000 iterations.

The crappy version for Scenario #1 runs 7.66x slower than the non-crappy version for 1,000,000 iterations.

Um, if you turn off the Python GC, the performance issues seem to disappear for a while !!!  Just a thought…
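For those who want to check the numbers, here is a minimal timeit sketch; the iteration counts come from the figures above, while the harness itself is my assumption rather than the original benchmark.

    # A minimal sketch for reproducing the Scenario #1 numbers with timeit.
    # The harness is an assumption; only the iteration counts come from above.
    import timeit

    CRAPPY = "toks = 'one two three four five six seven eight nine ten'.split()"
    CLEAN = ("toks = ['one', 'two', 'three', 'four', 'five', "
             "'six', 'seven', 'eight', 'nine', 'ten']")

    for n in (1000, 10000, 100000, 1000000):
        t_crappy = timeit.timeit(CRAPPY, number=n)
        t_clean = timeit.timeit(CLEAN, number=n)
        print("%9d iterations: split is %.2fx slower" % (n, t_crappy / t_clean))

    # Note: timeit turns the garbage collector off while timing by default;
    # pass setup="import gc; gc.enable()" to measure with the GC running.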

Scenario #1 Analysis

The real question is this:  “Was the crappy code being used often enough to justify the comments this code pattern elicited ?”  Probably not !!!

The justification for calling the crappy version truly crappy was performance, and there are some rather glaring performance concerns to be sure, but ONLY when the crappy version is used 1000 times more often than it actually was.

Those who claimed the crappy version was “crappy” had to magnify the usage pattern by a minimum of 1000 times before the crappy version’s performance cost became measurable.

I agree the crappy version would be truly crappy if it were the actual source of a measurable performance issue, one tied to lost revenue or some other demonstrable effect that was actually causing a problem.

The problem, as I saw it, had nothing to do with how crappy the code pattern may have been, because, let’s face it, this would be crappy code if it were used often enough for a problem to exist.

The problem, as I saw it, was a group of people all agreeing there was a problem where none really existed, simply because in their minds they had magnified the usage by 1000 times just to catch a glimpse of some kind of problem.

This piece of crappy code may have a real-world non-crappy use case that could save someone a lot of time, given the right set of circumstances: a huge data set maintained by hand that has to be loaded into a list at run-time.  The desire to make this crappy-looking code non-crappy, in a use case where the crappiness could NEVER actually be measured, is the real problem !!!  Far more time could be spent typing all those commas and quote marks than the cleanup would ever be worth.

Why any person or group of supposedly intelligent, talented software engineers would claim a harmless chunk of code was harmful, given that the actual use-case behind the original question involved a single use of the crappy version, is well beyond my ability to comprehend in real terms.

The person who raised the issue supposedly has more than 20 years of programming experience !!!  He found a single reference to the crappy version in a source file, one that was probably executed exactly once per iteration of some larger program.  WOW !!!  Talk about yelling “FIRE” in a crowded room !!!

The people who agreed with him were even more of a mystery: they are supposed to be among the best and brightest at this particular unnamed company, and yet they went along with the one guy who should have known better than to yell “FIRE” in a crowded room.

It is interesting to note that these same people, who were able to inflate a potentially crappy use-case beyond its original scope, are seemingly okay with the following truly crappy coding patterns they wish to do nothing about:

  • An Eager-loading ORM that maintains and uses exactly 1 Database Cursor per Session !!!  (See the sketch just after this list for the eager-versus-lazy difference.)
    • Why was this NEVER changed ???
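To make the eager-versus-lazy distinction concrete, here is a minimal sqlite3 sketch of the difference; it illustrates the general pattern only and is not the unnamed company’s ORM.

    # A minimal sqlite3 sketch of eager loading versus lazy iteration.
    # This shows the general pattern only; it is not the ORM in question.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("alice",), ("bob",), ("carol",)])

    # Eager: one cursor pulls every row into memory up front.
    eager_rows = conn.execute("SELECT id, name FROM users").fetchall()

    # Lazy: the cursor is an iterator, so rows are fetched as they are
    # consumed and huge result sets never sit in memory all at once.
    for user_id, name in conn.execute("SELECT id, name FROM users"):
        print(user_id, name)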

I will pause the list here because I think this one point bears further analysis.

How in the world do these crappy-code-detecting software engineers allow an Eager-loading ORM to exist in the first place ???  And the company wants this sort of thing corrected !!!

I have to wonder about the skills these crappy-code-detecting software engineers actually possess when they cannot find a way to remove the Eager-loading ORM in the first place !!!

Removing the Eager-loading ORM would be easy enough for me to accomplish, but then I can tell the difference between crappy code that is really crappy and crappy code that only seems to be crappy.

Well You See Timmy…

Now for the moral of this tale…

People who live in glass houses always seem overly eager to throw stones when their own houses have cracks in the walls so wide everyone knows there are problems.

I have no issue with people who see imagined problems, so long as they have their own houses in order, but that was not the case in this instance.

These very same people, so willing to detect crappy code patterns where no crappy use case existed, seem unwilling or unable to resolve glaring performance issues in a huge pile of code.

The rest of the issues these people could focus on are as follows:

  • mod_python rather than wsgi
  • Django not being used at all; but then, no discernible web framework is being used at all.
  • Eager-loading ORM – easy to resolve with Django.
  • Non-scalable Web App – because mod_python is being used rather than wsgi, for instance.
  • Development environment issues – all developers share a single instance of the run-time – each developer gets a different virtual host, but all development is done in a single Linux instance.
    • Okay, this one truly baffles me !!!
    • How difficult can it be to get each developer their own Linux instance at a moment in time when everyone has a Cloud-based solution for doing this ?!?

Look, all these issues could easily be handled, but none of them are being handled at all.  Why ???

The reason(s) all these glaring issues are not being handled is easy… lack of experience and lack of skill in the developer community.

Nobody wants to make any real changes or nobody is able to make any real changes.

Dragging along ancient code from the deep past and then being either afraid to update it or unwilling to update it is more than ridiculous !!!

Solutions !!!

The solution for all this is easy to state but very difficult to implement !!!

Rewrite your code every 18 months !!!

Better tools are being churned out all the time.

Django is a proven Web Framework !!!

Wsgi is a proven technology stack !!!

Python+Django+tornado+wsgi+nginx equals a scalable Web App that scales as easily as you can build an automated process for spinning up one more Linux Virtual Machine in the Cloud !!!

Or let’s put this another way…

Python+Django+tornado+wsgi+nginx was easy enough for me to handle all by myself; not that I would not have wanted to do this with a team, there just weren’t that many others I might have done this with.

The moment I achieved my first stable installation of Python+Django+tornado+wsgi+nginx I knew it was the way to go !!!

Python+Django runs as a separate wsgi web server with performance comparable to what you get from the Google App Engine, oddly enough, and “yes,” I have run benchmarks whose data tells me this.

Tornado is a stand-alone Python-based Web Server with very good performance characteristics.

Tornado talks to an instance of a Python+Django web app via wsgi.

Nginx talks to an instance of Tornado that talks to an instance of a Python+Django web app via wsgi.
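Wiring the middle of that stack takes only a few lines. Here is a minimal sketch using Tornado’s WSGIContainer of that era to wrap a Django application; the settings module name and port are placeholders.

    # A minimal sketch of serving a Django app from Tornado via wsgi.
    # The settings module name and port are placeholders.
    import os

    import tornado.httpserver
    import tornado.ioloop
    import tornado.wsgi
    from django.core.wsgi import get_wsgi_application

    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings")

    # WSGIContainer lets Tornado's HTTP server speak to any wsgi
    # application, Django included.
    wsgi_app = tornado.wsgi.WSGIContainer(get_wsgi_application())
    server = tornado.httpserver.HTTPServer(wsgi_app)
    server.listen(8000)  # nginx proxies requests to this port
    tornado.ioloop.IOLoop.instance().start()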

Why so many web servers in this stack ???

Why use Tornado at-all ???

I happen to know a little something I call Latency Decoupling that tends to make web pages serve much faster the more layers of web servers you use.

Nginx connected to many Tornado servers, each connected to one or more wsgi Web Apps, is far more efficient at serving web content than Nginx connected directly to that very same wsgi web app.

Latency Decoupling kicks in and your end-users have happy faces.

Ability to Scale the Web App also increases !!!

Many instances of the Web App within each Tornado instance !!!

Many Tornado instances within each Nginx instance !!!

Deployment gets easier !!!

Now, with a single Python or Ant script, you can spin up yet another Amazon EC2 instance, connect up the Nginx instances using a Load Balancer of some kind (nginx also does load balancing), and before you know it you have architected a really cool Django Cloud Solution that nobody else seems to have just yet.
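For a taste of what that single Python script might look like, here is a minimal sketch using the boto library of that era; the region, AMI id, key pair, and security group are placeholders, not a recipe from my actual deployment.

    # A minimal sketch of spinning up one more EC2 instance with boto.
    # Region, AMI id, key pair, and security group are placeholders.
    import boto.ec2

    # Credentials come from the environment or the boto config file.
    conn = boto.ec2.connect_to_region("us-east-1")

    reservation = conn.run_instances(
        "ami-12345678",            # placeholder AMI, e.g. an Ubuntu 12.04 image
        instance_type="m1.small",
        key_name="my-keypair",
        security_groups=["web"],
    )
    print("Launched", reservation.instances[0].id)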

Build a slick Control Panel for your Django Cloud Users and bingo you have the ability to grow a Web App from a single instance to any number just by clicking a button on a web page !!!

The only other detail would be how you monetize all this into something you can use to generate revenue.

All of this is easy enough to build when you have all the parts.

All of this should be easy enough for any company to use, if only they had I.T. staffers who had played around with these kinds of solutions, but alas, that experience seems to be lacking at most companies, except for a few.

Too bad most would-be skilled programmers will tend to scoff at most of what’s written in this article as some form of “crazy”… but then, once upon a time, the notion of generating electricity was also seen as “crazy,” along with the notions of gravity and quantum physics.  I can live with what others wish to say… so long as I get to build something really cool along the way.

This proves I have been working on the Public CargoChief Site !!!

Yes, you heard it from me first.

Whenever we get to the day when CargoChief goes public and I am worth millions (*cough* *cough*), we can all point back to the days when I was working on the Public Site for CargoChief for peanuts and a big IOU, I hope.

Munin for CargoChief; seems like I have some systems-level Linux skills after all… as if there was ever any doubt out there.

Munin for preview.cargochief.com

BTW – I rolled the back-end for this site myself using Python/Django/Tornado/nginx/wsgi running in Ubuntu 12.04 hosted at Amazon EC2.

Now, you might not know by looking at me or talking to me that I know how to do all this fancy Linux magic… well, I do, and I have been doing all this fancy Linux magic for going on 10+ years now; then again, I got started with Unix back in the day (circa 1983), when Linux did not even exist yet.

The nice thing about Amazon EC2 is… it’s pretty easy to whip up an on-demand, highly-scalable back-end using nothing but a bit of duct tape and some imagination.  Everyone else is using Puppet and going totally nuts with complexity, when all it really takes is maybe one script, some imagination, and a better-than-average understanding of TCP/IP. *pats himself on the back*


SSH Man-in-the-Middle (Soft Hack)

A successful SSH-based Man-in-the-Middle Attack might go something like this; there are a few assumptions, but lazy or novice users will probably fall for some if not all of this.

Assumption #1

This works best for users of newly established VMs such as those one might provide via Amazon EC2.

Assumption #2

Users who simply do not pay close attention to the details will fall for much of what this attack seeks to accomplish.

Assumption #3

SSH Terminal Emulation is required by this hack.  This can be easily built using Python and Paramiko.

Assumption #4

SCP Emulation is also required to make this hack work. This can be easily built using Python and Paramiko.

Assumption #5

The end-user connects to the man-in-the-middle via SSH Terminal Emulation to begin using the newly created VM Instance, logging in via password or a key-pair provided by the attacker.  This works best when the end-user will just go along with letting someone else do most of the work, such as establishing the key-pair for a seemingly newly created VM instance.

Assumption #6

Assuming the end-user chose to connect using SCP via a password, the man-in-the-middle need not even know if the provided password is correct since the SSH Terminal Emulation is just window-dressing.

Assumption #7

The SSH Terminal and SCP Emulation are all fake; I was almost sure this was already understood by most readers.  The end-user will “see” what looks like a freshly installed Linux file system with either nothing in the .ssh folder or a key-pair that was provided for the user.  The goal here is to get the end-user to accept the man-in-the-middle as legit even though it is not at all legit.

Assumption #8

Once the end-user has either accepted the provided key-pair or uploaded a real public key the fun can begin.

Assumption #9

Now that the end-user has jumped through all the required hoops and is really using a fake SSH/SCP Terminal Session, the user will make requests of the fake man-in-the-middle “proxy.”  (The man-in-the-middle will coax the end-user into uploading a real Public Key, the one the real end-point is using; the end-user will also be coaxed into revealing the Private Key’s passphrase, all the while thinking this makes the whole system that much more secure…)  You can probably see where this is going by now.  If not, there is still some hope for your soul.

Assumption #10

The Internet is a huge house of cards surrounded by smoke and mirrors and enough assumptions to allow almost any hacker with sufficient skill to achieve whatever may be desired.  Users have to know more than the hackers they share their Internet experiences with… most users do not.
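One closing note for the would-be victims: the entire soft hack above falls apart the moment the client refuses unknown host keys. Here is a minimal Paramiko client sketch of that one detail; host name, user name, and key path are placeholders.

    # A minimal sketch of the detail that defeats this attack: refuse to
    # connect when the server's host key is not already known and verified.
    # Host name, user name, and key path are placeholders.
    import paramiko

    client = paramiko.SSHClient()
    client.load_system_host_keys()  # trust only keys already on record
    client.set_missing_host_key_policy(paramiko.RejectPolicy())  # never auto-accept

    # connect() raises SSHException when the host key is unknown or does
    # not match, which is exactly what happens when a man-in-the-middle
    # answers instead of the real VM.
    client.connect("vm.example.com", username="ubuntu",
                   key_filename="/home/me/.ssh/id_rsa")
    client.close()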

Enjoy !!!

Ubuntu Enterprise Linux 11.04

Everyone knows RHEL (Red Hat Enterprise Linux) is all “Enterprise” just because “Enterprise” is in the name – Doh !

Now I give you Ubuntu Enterprise Linux 11.04

Method #1

This is really super-simple because, as my readers know, the simpler the better, even when simpler is largely overlooked by the masses just because it is perceived as just too simple.

  1. Install RHEL (any version works) on the computer of your choice; the computer cannot be older than 2009 for this to work. (DO NOT USE A VM)
  2. Install the latest VirtualBox version 4 or later.
  3. Create a VM using VirtualBox 4.x in RHEL.
  4. Install Ubuntu Server or Desktop 11.04 in the VM you created in Step #3.
  5. Done !
Now all I/O will flow through the Magic Unicorn OS known as Red Hat – and now Ubuntu is using Red Hat for everything one might want to use Red Hat for.
Just in case some of you are reaching for your phones to call the nearest asylum to have little ole me admitted on a 72-hour administrative hold… LOL.  Keep in mind the same thing is being done by your nearest Citrix XenServer 5.6, because it too uses RHEL as the host OS in which you are expected to run your Guest OS in a VM. Hehe.
I actually proposed this to a Manager I have been working with, but the idea was not embraced, probably because it just makes too damned much sense, especially if the product you are managing runs on Debian but your I.T. support people are telling you they will only support RHEL because it is the blessed OS for the Enterprise.
This same technique works great for any guest OS – Windows in Red Hat and other variations.

Method #2

Install apt-get in Red Hat.  Don’t laugh, this has been done; google it.
At the end of the day, Linux code is Linux code and works in any Linux; all roads lead to the same China where Linux is concerned.  Even those who say they are adding special powers to their favorite Linux (RHEL) are really doing very little other than re-branding the same Linux everyone else is using.  If any Linux were gonna have Magic Unicorn Powers, it would surely be RHEL, because it says it is “Enterprise” and it should surely be measurably better than the rest, but this just ain’t the case, in real terms.
Slap apt-get into RHEL and you get the best of Ubuntu in RHEL without Ubuntu.

Conclusions

Crazy ideas only seem “crazy” until they catch on with the masses.
Not all that long ago we all might have scoffed at the idea of using Virtual Machines rather than real computers, until we learned just what the cost of operation is for a real server.  Deploy real servers by the thousands and then get ready to buy your own small power plant, because that’s exactly what you will feel you need every time you see the electric bill.
The idea of running a VM inside a host OS is nothing more than yet another way to achieve the same goal as VMware ESX and other products you have to spend real money to get.
Cheers.

MySQL for BigData

If all you have to work with is MySQL but you have PetaBytes to store… you could be in trouble unless… you happen to be me…

Assumption #1

Relational databases love executing really small SQL Statements.

Assumption #2

Relational databases do NOT have to use any relational features.

Assumption #3

Networked Object-Oriented data models are very efficient when all you have to work with is a Relational Db as the data management platform.

Assumption #4

BigData solutions tend to use really big heaps of key/value storage systems because the data can be spread out over a large number of nodes easily.

Assumption #5

Many instances of MySQL can execute the same query faster than a single instance because the distributed query can be executed in parallel.

Assumption #6

Forget everything you ever thought you knew about how to cluster MySQL because all that crap won’t help you when you have PetaBytes to store and manage efficiently.

Solution #1

Store your BigData in many instances of MySQL (think 10’s or 100’s) using a Networked Object-Oriented Data Model: key/value pairs are linked to form objects using nothing but metadata, itself stored as key/value pairs, while the data is spread out across all available MySQL nodes.  Then execute the SQL required to retrieve Collections of Objects in parallel, and MySQL can be nice and fast for BigData.
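Here is a minimal sketch of the fan-out part of that idea, assuming the pymysql driver and a hypothetical list of shard hosts and key/value table; the point is simply that the same SQL runs against every node in parallel and the partial results merge into one collection.

    # A minimal sketch of running the same SQL against many MySQL nodes in
    # parallel and merging the results. pymysql is assumed as the driver;
    # the shard host list and table layout are hypothetical.
    from concurrent.futures import ThreadPoolExecutor

    import pymysql

    SHARDS = ["db01.example.com", "db02.example.com", "db03.example.com"]
    SQL = "SELECT obj_id, k, v FROM kv_store WHERE k = %s"

    def query_shard(host, key):
        conn = pymysql.connect(host=host, user="bigdata",
                               password="secret", database="bigdata")
        try:
            with conn.cursor() as cur:
                cur.execute(SQL, (key,))
                return cur.fetchall()
        finally:
            conn.close()

    def query_all(key):
        # One thread per shard: the same statement executes in parallel
        # on every node, and the partial results form one collection.
        with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
            parts = pool.map(lambda h: query_shard(h, key), SHARDS)
        return [row for part in parts for row in part]

    print(query_all("color"))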

Caveat #1

Do you know what is meant by “Networked Object-Oriented Data Model” ?!?  Probably not, but this gives you something to figure out while looking for all those cheap computers you will use to form your MySQL Network.

Caveat #2

Do you know what is meant by “executing the same SQL Statement in Parallel” ?!?  Probably not, but this gives you something to figure out while you think about the prior Caveats.

Caveat #3

Do you know the process of fetching data from all those MySQL Instances can be driven by a single SQL Statement ?!?  Probably not, but then you probably forgot to read over and understand Assumption #6 from above.  Think about Collections of Objects more than Rows of Data.

Caveat #4

Keep it super-simple.  Super-Simple runs faster than the other thing.

Computers are really stupid but can be fast.

Stupid requires simple.

Simple is FAST.

BigData is FAST when the solution is parallel but stupid simple.

Caveat #5

Try to optimize each MySQL Instance by increasing the available RAM to a minimum of 4 GB per instance, using 32-bit MySQL running in a 32-bit Linux OS, and use VMware Workstation to run each instance on a separate CPU Core, with a minimum of 1 VMware Workstation Instance per CPU Core.  Unless, that is, you can find a MySQL implementation that automatically uses multiple cores, in which case you have to give some serious thought to how to make all those MySQL Instances execute the same SQL Statements in parallel.  Better think about this one for a while… I already know how to do this, but you might not.


HADOOP Optimization Technique #1

HADOOP is slow !

BigData should be FAST !

Single Server installations for HADOOP tend to want to use the entire multi-core CPU for one single HADOOP instance.

Assumption #1

The Java JVM has NOT been optimized for multiple cores for anything other than garbage collection when one uses an out-of-the-box JRE.

Assumption #2

HADOOP has NOT been optimized for multiple cores for anything other than garbage collection, per Assumption #1.

Assumption #3

Most servers HADOOP might run on probably have multiple cores, especially when Intel or AMD chips are being used, due to the need to keep Moore’s Law alive in a Universe where the upper bound for CPU performance is the RAM bus speed.

Assumption #4

VMware Workstation Appliances can each be run on a separate core when the host OS is Windows Server 2008 R2.

Assumption #5

VMware Workstation Appliance Instances will be run at the HIGH Priority setting (one level below Real-time for Windows Server 2008 R2).

Assumption #6

VMware Workstation Appliance Instances will be given 4 GB RAM using 32-bit HADOOP in a 32-bit Linux OS; all software being used is 32-bit.  No 64-bit code will be used.

Possible Solution #1

If the server has 4 cores, then run 4 instances of HADOOP, each in a separate VMware Appliance, where each VMware Workstation instance is dedicated to one of the available cores.

Scale for the number of cores.

Continue packing in separate VMware Instances using VMware Workstation until the aggregate performance begins to degrade, and then use empirical performance data to determine the optimal configuration.

Caveat #1

Solution #1 has not yet been tried; however, based on the available information, it should produce better performance for HADOOP and/or Java in general.

