Just for fun !!!

So today, just for fun, I had a wild idea !!!

Windows Web Server 2008 hosting IIS 7 running Python 2.7 + Tornado + Django 1.3 doing the same thing I was able to achieve with Ubuntu + Python 2.7 + Django + wsgi + Tornado + nginx !!!

I doubt the Windows performance will match that of Ubuntu, but this has been all kinds of fun !!!

See also:  Running Django on Windows (with performance tests) !!!

I was originally interested in doing Django 1.3 with Windows Web Server 2008 and IIS 7 so what I found was pretty darned cool.

Point-and-click installations for all this stuff were nothing less than amazing !!!  Especially for Windows !!!  WTG Microsoft !!!

What I really want is WebDAV so I can share my huge pile of files with myself, and only myself, via the Internet, and WebDAV seems to do the trick…  The problem is Windows Web Server 2008 doesn’t seem to know how to do WebDAV, so I have to improvise a bit by making this stack do the trick, as follows:

Tornado does wsgi !!!

Now IIS 7 does Tornado !!!

Easy as 1,2,3 !!!
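Here is roughly what the “Tornado does wsgi” piece looks like in code.  This is a minimal sketch, assuming a hypothetical Django 1.3 project whose settings live in mysite.settings (your project name will differ); IIS simply proxies requests to whatever port this script listens on.

# Minimal sketch: Tornado serving a Django 1.3 project over WSGI.
# "mysite.settings" is a hypothetical settings module -- substitute your own.
import os
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')

import tornado.httpserver
import tornado.ioloop
import tornado.wsgi
import django.core.handlers.wsgi

django_app = django.core.handlers.wsgi.WSGIHandler()    # Django speaks WSGI
container = tornado.wsgi.WSGIContainer(django_app)      # Tornado wraps any WSGI app
server = tornado.httpserver.HTTPServer(container)

if __name__ == '__main__':
    server.listen(8080)    # IIS (or nginx) proxies requests to this port
    tornado.ioloop.IOLoop.instance().start()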

Oh, and this puts me one step closer to having my own Private Cloud in my home !!!



Why doesn’t “crappy” code perform as badly as people seem to think ?

I like good code as much as the next person, but I am wondering why people spend so much time focusing on “crappy” code patterns that don’t actually perform as badly as expected ?!?

Here’s a thought:  If you want to complain about “crappy” code then why not fix all the crappy code you become aware of including all that stuff you cannot seem to understand !!!

Or maybe just realize that your idea of “crappy” code may be another man’s treasure !!!  Oh wait, someone already thought of this when the phrase “…one man’s trash is another man’s treasure…” was coined, circa the 1570s !!!

Or to put this another way…  those who dismiss some coding patterns as “crappy” might just be limiting their own ability to perceive really useful code they would otherwise overlook !!!

I am NOT in favor of truly crappy code !!!

But, I am also NOT in favor of being so close-minded that I cannot see how useful alternative coding styles may be.

Single instances of “crappy” code cannot be crappy !!!

If you have to run thousands or millions of iterations to see a performance problem in a single-use chunk of code, then there is very little reason to worry about any such performance problem !!!

I might be impressed by those who rail against “crappy” code if they also made darned sure that none of the code in their own field of vision was “crappy” code !!!

Scenario #1

For instance, consider the following pattern that was recently classified as being “crappy” by an intrepid band of hearty Python coders at a company we dare not name here…

This is “crappy” !

toks = 'one two three four five six seven eight nine ten'.split()

This is not !

toks = ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten']

The crappy version was said to be “crappy” because the split was seen as unnecessary; the non-crappy version was said to be the desired result.

Those who said the crappy version was crappy have probably never had to load a list from a comma-delimited jungle of data they spent several days entering by hand; otherwise they might have gained some useful experience as to why the crappy version is somewhat less crappy after all.

Scenario #1 Benchmarks

I like benchmarks.  I like to know when code may perform badly at run-time at scale.

The crappy version for Scenario #1 runs 5x slower than the non-crappy version for 1,000 iterations.

The crappy version for Scenario #1 runs 7.6x slower than the non-crappy version for 10,000 iterations.

The crappy version for Scenario #1 runs 7.65x slower than the non-crappy version for 100,000 iterations.

The crappy version for Scenario #1 runs 7.66x slower than the non-crappy version for 1,000,000 iterations.

Um, if you turn off the Python GC the performance issues seem to disappear for a while !!!  Just a thought…
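If you want to reproduce numbers like these for yourself, a quick harness with the standard library’s timeit module is all it takes; the exact ratios will vary with your machine and Python build.  This is only a sketch of the kind of measurement being described, not the exact benchmark quoted above.

# Quick sketch: time the "crappy" split() version against the literal list.
import timeit

crappy = "toks = 'one two three four five six seven eight nine ten'.split()"
non_crappy = ("toks = ['one', 'two', 'three', 'four', 'five', "
              "'six', 'seven', 'eight', 'nine', 'ten']")

for n in (1000, 10000, 100000, 1000000):
    t_crappy = timeit.timeit(crappy, number=n)
    t_clean = timeit.timeit(non_crappy, number=n)
    print '%8d iterations: split() is %.2fx slower' % (n, t_crappy / t_clean)

One detail worth knowing: timeit turns the garbage collector off while it measures, which ties in with the GC observation above; pass setup='import gc; gc.enable()' if you want the GC left on during the timing.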

Scenario #1 Analysis

The real question is this:  “Was the crappy code being used often enough to justify the comments this code pattern elicited ?”  Probably not !!!

The justification for calling the crappy version truly crappy was performance, and there are some rather glaring performance concerns to be sure, but ONLY when the crappy version is used 1000 times more often than it actually was.

Those who claimed the crappy version was “crappy” had to magnify the usage pattern by a minimum of 1000 times before the crappy version’s performance data might be measurable.

I agree the crappy version would be truly crappy if it were the actual source of some kind of measurable performance issue related to the loss of revenues or some other demonstrable effect that was actually causing some kind of problem.

The problem, as I saw it, had nothing to do with how crappy the code pattern may have been because, let’s face it, this would be crappy code if it were used often enough for a problem to exist.

The problem, as I saw it, was a group of people all agreeing there was a problem where no problem really existed at all, simply because in their minds they were magnifying the usage by 1000 times just to be able to catch a glimpse of some kind of problem.

This piece of crappy code may have a real-world non-crappy use case that could have saved someone a lot of time, given the right set of circumstances: having to maintain a huge data set by hand that had to be loaded into a list at run-time.  The desire to make this crappy-looking code non-crappy, in a use case that could NEVER actually be measured as being crappy, is the real problem !!!  Far more time could have been spent entering all those commas and quote marks, just to make the code look less crappy, than the effort would have been worth.

Why any person or group of people who are supposed to be intelligent, talented software engineers would claim a harmless chunk of code was harmful, given the actual use-case in the actual source file behind the original question (a single use of the crappy version), is well beyond my ability to comprehend in real terms.

The person who raised the issue was supposed to have more than 20 years of programming experience !!!  He found a single reference to the crappy version in a source file that was probably being used exactly once per iteration of some larger program.  WOW !!!  Talk about yelling “FIRE” in a crowded room !!!

The people who agreed with him were even more of a mystery, because these people are supposed to be among the best and the brightest at this particular unnamed company, and they went along with the idea raised by the one guy who should have known better than to yell “FIRE” in a crowded room.

It is interesting to note that these same people, who were able to inflate a potentially crappy use-case beyond its original scope, are seemingly okay with all of the following truly crappy coding patterns they seem to wish to do nothing about:

  • An Eager-loading ORM that maintains and uses exactly 1 Database Cursor per Session !!!
    • Why was this NEVER changed ???

I will stop here with the analysis because I think this one point bears further analysis.

How in the world do these detectors of “crappy” code allow an Eager-loading ORM to exist in the first place ???  And the company wants this sort of thing corrected !!!

I have to wonder about the skills these detectors of “crappy” code actually possess when they cannot find a way to remove the Eager-loading ORM in the first place !!!

Removal of the Eager-loading ORM would be easy enough for me to accomplish, but then I can tell the difference between crappy code that is really crappy and crappy code that only seems to be crappy.

Well You See Timmy…

Now for the moral of this tale…

People who live in glass houses always seem overly eager to throw stones when their own houses have cracks in the walls so wide everyone knows there are problems.

I have no issue with people who can see imagined problems that don’t exist, so long as they have their own houses in order, but that was not the case in this instance.

These very same people, who seemed more than willing to detect crappy code patterns where there was no crappy use case, seem unwilling or unable to resolve glaring performance issues in a huge pile of code.

The rest of the issues these people could focus on are as follows:

  • mod_python rather than wsgi
  • Django not being used at all, but then there is no discernible web framework being used at all.
  • Eager-loading ORM – easy to resolve with Django (see the sketch just after this list).
  • Non-scalable Web App – because mod_python is being used rather than wsgi, for instance.
  • Development environment issues – all developers share a single instance of the run-time; each developer gets a different virtual host, but all development is being done in a single Linux instance.
    • Okay, this one truly baffles me !!!
    • How difficult can it be to get each developer their own Linux instance at a moment in time when everyone has a Cloud-based solution for doing this ?!?
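For what it’s worth, here is why Django makes the eager-loading issue easy to resolve: Django QuerySets are lazy and only hit the database when the results are actually consumed.  The Article model below is purely hypothetical, just to illustrate the point.

# Hypothetical app and model, for illustration only -- Django QuerySets are
# lazy, so nothing touches the database until results are actually used.
from myapp.models import Article

qs = Article.objects.filter(published=True)    # no SQL has run yet
qs = qs.exclude(title__startswith='Draft')     # still no SQL
for article in qs[:10]:                        # one query runs here, limited to 10 rows
    print article.title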

Look, all these issues can be easily handled, but none of them are being handled at all.  Why ???

The reason all these glaring issues are not being handled is simple… a lack of experience and a lack of skill in the developer community.

Nobody wants to make any real changes or nobody is able to make any real changes.

Dragging along ancient code from the deep past and then being either afraid to update it or unwilling to update it is more than ridiculous !!!

Solutions !!!

The solution for all this is simple to state but very difficult to implement !!!

Rewrite your code every 18 months !!!

Better tools are being churned-out all the time.

Django is a proven Web Framework !!!

wsgi is a proven interface between web servers and Python web apps !!!

Python+Django+tornado+wsgi+nginx equals a scalable Web App that scales as easily as you can build an automated process for spinning-up one more Linux Virtual Machine in the Cloud !!!

Or let’s put this another way…

Python+Django+tornado+wsgi+nginx was easy enough for me to handle all by myself – not that I wouldn’t have wanted to do this with a team of others – there just weren’t that many others I might have done this with.

The moment I achieved my first stable installation of Python+Django+tornado+wsgi+nginx I knew it was the way to go !!!

Python+Django runs as a separate wsgi web server with performance comparable to what you get from Google App Engine, oddly enough, and yes, I have run benchmarks whose data tell me this.

Tornado is a stand-alone Python-based Web Server with very good performance characteristics.

Tornado talks to an instance of a Python+Django web app via wsgi.

Nginx talks to an instance of Tornado that talks to an instance of a Python+Django web app via wsgi.

Why so many web servers in this stack ???

Why use Tornado at all ???

I happen to know a little something I call Latency Decoupling that tends to make web pages serve much faster the more layers of web servers you use.

Nginx connected to many Tornado servers, each connected to one or more wsgi Web Apps, is far more efficient at serving web content than Nginx connected directly to that very same wsgi web app.

Latency Decoupling kicks-in and your end-users have happy faces.

Ability to Scale the Web App also increases !!!

Many instances of the Web App within each Tornado instance !!!

Many Tornado instances within each Nginx instance !!!
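As a sketch of the “many Tornado instances” idea, Tornado can pre-fork several worker processes from a single script, all serving the same Django WSGI app, and nginx is then pointed at the port those workers share.  Same caveat as the earlier sketch: mysite.settings is a hypothetical module name and the port is arbitrary.

# Sketch: several pre-forked Tornado workers behind one port for nginx to proxy to.
import os
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')

import tornado.httpserver
import tornado.ioloop
import tornado.wsgi
import django.core.handlers.wsgi

if __name__ == '__main__':
    app = tornado.wsgi.WSGIContainer(django.core.handlers.wsgi.WSGIHandler())
    server = tornado.httpserver.HTTPServer(app)
    server.bind(8001)     # the port nginx proxies to
    server.start(4)       # fork 4 worker processes sharing the same listening socket
    tornado.ioloop.IOLoop.instance().start()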

Deployment gets easier !!!

Now, with a single Python or Ant script, you can spin-up yet another Amazon EC2 instance, connect-up the Nginx instances using a Load Balancer of some kind (nginx also does Load Balancing), and before you know it you have architected a really cool Django Cloud Solution that nobody else seems to have just yet.
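The Python side of that spin-up script can be surprisingly small.  Here is a sketch using boto (one AWS library from this era); the region, AMI id, key pair and security group below are placeholders, not real values, and the AMI is assumed to already have the Python+Django+tornado+wsgi+nginx stack baked in.

# Sketch: start one more EC2 instance.  All identifiers below are placeholders.
import boto.ec2

conn = boto.ec2.connect_to_region('us-east-1')   # credentials come from the boto config / environment
reservation = conn.run_instances(
    'ami-12345678',                # placeholder AMI with the web stack pre-installed
    instance_type='m1.small',
    key_name='my-keypair',
    security_groups=['web'],
)
print 'started instance', reservation.instances[0].id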

Build a slick Control Panel for your Django Cloud Users and bingo you have the ability to grow a Web App from a single instance to any number just by clicking a button on a web page !!!

The only other detail would be how you monetize all this into something you can use to generate revenue.

All of this was easy enough to build when you have all the parts.

All of this should be easy enough for any company to use, if only they had I.T. staffers who had played around with these kinds of solutions, but alas that seems to be lacking in most companies, except for a few.

Too bad most would-be skilled programmers would tend to scoff at most of what’s written in this article as being some form of “crazy”… but then once upon a time the notion of generating electricity was also seen as being “crazy” along with the notion of gravity and quantum physics.  I can live with what others wish to say… so long as I get to build something really cool along the way.

Node.js Achilles Heel

The good thing about Node.js is all that JavaScript running on my server !!!

The bad thing about Node.js is all that JavaScript running on my server !!!

Hey, if you love JavaScript then by all means use the heck out of Node.js; you may come to realize why Node.js is so weak as a server-side technology.  Heck, almost nobody even cares why Ruby on Rails is so weak and it’s got millions of followers.

Node.js lacks a threading model !!!

So who cares ?  Ruby lacks a useful threading model too and nobody cares about that at all.

Ok, to be fair, Ruby 1.9.x does use native threads but Ruby 1.8.x does not.

Node.js has no threading support at all because JavaScript has no threading support.

As a casual user, or typical Manager, you will never even know or care about the lack of a threading model… none of your developers will either, for that matter.

On the other hand, if you ever try to develop some swift Node.js gizmo that could benefit from a threading model, well let’s just say you will be stuck with a slow service.

The good news is, even without a threading model Node.js can be used effectively, but sadly few of your under-30 coders will probably know anything about how to engineer around the lack of a threading model.

Enjoy the lack of threads… Ruby lovers have for a number of years and you don’t hear anything from them about this either way.

Addendum – threading model

So there kind of is a threading model for Node.js, but only in the grossest manner – you can fork or spawn a process – big deal !!!  This is not the same thing as spinning-up a thread by any stretch of the imagination.

Node.js is too immature for prime time

By now you should be able to see a break-out Node.js Web Framework on par with Python’s Django, but instead there is only a ton of choices with no clear winner.

The bottom line is, Node.js is all the rage this year.  But will Node.js still be there next year, and the year after, with the same strong following, once people figure out just how Node.js works ?!?

Here’s what you can look forward to with Node.js

You will look at Node.js and fall madly in love, no doubt.

Node.js can be very fast but only when you use it sparingly and then only for lightweight processes.

Node.js cannot do any heavy lifting because it lacks a threading model, which means you will be spinning-up Processes, not Threads, and as we all should know a Process is very heavy, much heavier than a Thread.

Node.js will be a bit more of a pain to scale unless you want to buy/rent/lease a separate server for each Node.js process; if so, go ahead and throw all your money into Node.js servers with my blessing.  I will be spending much less on my Python servers, and not only because I can pile more services onto each Python server…

Node.js is this year’s Ruby on Rails.  *yawn*  So what else is new ?!?

Node.js appeals to non-techies and so does Ruby.

Node.js can be useful but only for those who know how to leverage it properly.

Use Node.js for simple one-off web services and keep Python around for the heavy lifting.

The bottom line

JavaScript is JavaScript no matter how much lipstick you pile on.  JavaScript was designed for the browser.  Get over it already !!!  Yes, you can run JavaScript on your servers – big deal !!!  I can run the Chrome browser on my servers too; does this automatically mean I should be using Google Chrome for my web services ?  I mean really !!!

What’s next ?  Let’s run Adobe AIR servers ?!?   Oh, no, I forgot everybody is supposed to hate Adobe, right ?!?

Let’s keep JavaScript running in our browsers…  Servers are for serious system-level work and this is why god made Python and Stackless Python anyway.

Automatic SEO –> Test Drive it today !!!

When any and all URLs based on the CargoChief.Com domain lead the end-user to the CargoChief site, you are free to explore SEO (Search Engine Optimization) simply by using whatever URLs you wish, such as the following:

http://www.cargochief.com/book/your/next/shipment/through/us/

or

http://www.cargochief.com/helps/you/connect/with/shippers/

or

http://www.cargochief.com/gets/your/pallets/shipped/today/

or

Whatever your marketing people might wish to use… you can feel free to be as creative about your message and how your URLs are positioned in whatever search engines you may wish to use…

Yet another useful innovation you may wish to begin using today.

Go ahead and give it a try…  the URL you bookmark will be the SEO-friendly URL rather than the actual end-point even when the actual end-point takes you to the CargoChief Cloud.

Only works with http://www.CargoChief.Com or CargoChief.Com domains.

Never see another 404 page again !!!
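For the curious, the mechanism behind a “no 404s” setup like this can be as simple as a catch-all URL pattern.  The snippet below is only a sketch in Django 1.3 terms, not CargoChief’s actual code.

# Sketch only: a final catch-all pattern in urls.py sends any unrecognized
# path to the home page instead of a 404.
from django.conf.urls.defaults import patterns, url
from django.views.generic.simple import redirect_to

urlpatterns = patterns('',
    # ... the real, specific URL patterns go first ...
    url(r'^.*$', redirect_to, {'url': '/'}),   # everything else lands on the home page
)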

Brought to you by Vyper Logix Corp, innovation for the 21st Century and beyond.

How can you tell when a software algorithm can be parallelized ?!?

Parallelization is a pet project of mine and has been for many years…

Recently I was interviewing at Paypal, which by the way does NOT hire talent, and especially NOT talent with a conscience… more on Paypal later…

Parallelization…

Algorithms that seek to collect or aggregate data can be parallelized far more easily than algorithms that cannot be broken into independent chunks of work.

This means Analytics Applications are ripe for parallelization, because Analytics is all about collecting aggregations from raw data.

How to…

All you need is a bit of experience with TCP/IP, in the form of connecting a Process running on one computer to a Process running on another computer.  Repeat this as often as needed and you too can create a network of Processing Nodes, each of which is connected to the rest; the real magic lies in how you achieve this goal in an efficient manner.

Next, you need to find a way to ask each Processing Node to perform part of the Aggregation Process using one chunk of data each.

Then find a way to get each Processing Node to make the request for its chunk of data at the same time, or as close to the same time as possible.

Then find a way to collect the results from each Processing Node in a parallel manner – again this is where you will find some magic.

HINT: TCP/IP is useful because it allows many requests to be made at the same time while allowing many results to be collected very quickly.

Before you know it you have the framework for parallelization but again there is ample room for doing some magic within the framework.
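Here is a single-machine sketch of the same pattern, using the standard library’s multiprocessing module in place of hand-rolled TCP/IP between machines; each worker process plays the role of a Processing Node, aggregates one chunk, and the parent merges the partial results.  Spreading the workers across computers over TCP/IP is the same idea, just with more plumbing.

# Sketch: parallel aggregation with local worker processes standing in for
# networked Processing Nodes.
from multiprocessing import Pool

def aggregate_chunk(chunk):
    # the part of the Aggregation Process each Processing Node performs
    return sum(chunk)

if __name__ == '__main__':
    data = range(1000000)
    chunk_size = 100000
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    pool = Pool(processes=4)                       # four local "Processing Nodes"
    partials = pool.map(aggregate_chunk, chunks)   # fan the chunks out in parallel
    pool.close()
    pool.join()

    print 'total =', sum(partials)                 # merge the partial aggregates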

HINT: Python makes this easier, because Stackless Python can be combined with multi-threading, TCP/IP, Parallel Python, Cython and Psyco… you can’t get this much bang for the buck from Ruby, but you can surely try if you must.

HINT: If you are me then you are more able to use your experience to make your typical non-aggregating algorithm into one that does perform some kind of aggregation… 😉

Paypal’s Interviewing Practices and the lack of ethics

Once upon a time I interviewed with a guy at Paypal who asked me to code a datetime object, in an unspecified language, without stopping to consider that his goal had already been accomplished in just about every computer language he or I might wish to use.  I mentioned this during the interview and he failed to respond with anything other than his lack of desire to continue.  My question was driven by my desire to be ethical, since I consider it unethical to write code for which an existing solution already exists, and I wanted to know why I was being asked to violate my own sense of ethics.  Not that I am opposed to violating my own sense of ethics, so long as I can answer the questions I might be asked later during the deposition, in case the request results in legal action, as most ethical breaches eventually do.  I have to wonder about professionals who have not spent any of their time thinking about what may or may not be ethical behavior.

Why “ethics” ?

Because I am a professional software engineer who, from time to time, writes code by the hour, and if I were to spend billable time (as would be the case for Paypal in this instance) working on code I knew had already been written, I would feel as though I was taking money under false pretenses, and this would be a breach of ethics for me.  Not that this guy at Paypal would feel the same, because he apparently had no ethical problem with asking me to work on code I would not otherwise have to build.  Maybe if those who ask contractors to work would also stand behind their requests, someone like me would not need to develop a sense of ethics… think about it, because I have.

Realistic Interviews !?!

I think professional interviews should be 100% reflective of real life, which means that if some manager actually asked me to work on some kind of datetime object and I actually did so, rather than using one that already exists, I would hope to be considered less than honest, because aren’t I supposed to know which software objects already exist and which don’t ?!?
Consider the other side of this coin… what if I approached my manager at Paypal and asked him or her for permission to work on something, and the something I asked to work on already exists as some kind of open source thing I could have much more easily used ?!?  Wouldn’t I be less than honest if I were being paid by the hour to spend billable time working on something that already exists ?!?  I would think so, but then I have a sense of ethics and a desire not to waste my professional time.
Interviews should be as realistic as possible in terms of professional expectations and professional ethics.
Nonetheless… I will remain as ethical as I can be even when it takes money out of my pocket because this is the right thing to do !!!
And those who lack ethics are generally punished by the legal system sooner or later…

Ruby 1.9.2 uses Real OS Threads !!!

Yeah – Rejoice Ruby-nauts !!!

Ruby 1.9.2 uses Real OS Threads !!!

Here’s the Code !!!

 

Here’s the Proof in Living HD Color !!!

Ruby 1.8.7 does NOT use Real OS Threads but Python does !!! Proof !!!

Proof that Ruby 1.8.7 does NOT use Real OS Threads !!!

How many times have I heard someone tell me Ruby 1.8.7 uses Real OS Threads and always has from the beginning ?!?

Far too many times…

You see, some people are so deep in denial they want Ruby to be what it very clearly is not and cannot be.  They proclaim Ruby can do everything it cannot do, like use Real OS Threads by default, without so much as taking the time to give it a real, objective test.  Here is a real, objective test.  I have no vested interest in either Ruby or Python; my only interest is in testing to see which one is using Real OS Threads.

Ruby 1.8.7 uses green threads, which is to say Ruby simulates the use of Threads, but it does NOT use Real OS Threads by default.  There are some Ruby Gems out there that do allow Real OS Threads to be used, but Ruby itself, by default, does NOT use Real OS Threads.  If Ruby did use Real OS Threads, one would see them when viewing the process at runtime.

Python 2.5.x does use Real OS Threads because you can see the number of threads appear when viewing the process at runtime.

You cannot fool the OS.  When there are Real OS Threads they show up; otherwise nothing but a single thread shows up.

Run the tests. Write some code. Look at the OS Process.  See the Proof. It’s just that easy !  In the meantime, don’t try to talk smack to me unless you have actually seen the proof for yourself !!!
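If you want to run that kind of test yourself, here is a minimal Python sketch of the idea (my own stand-in, not necessarily the exact code embedded further down): spin up a few busy threads, print the PID, and then look at the process from the OS side with something like ps -M <pid> on OS X or ps -o nlwp <pid> on Linux.

# Sketch: make the OS show its threads.  Start a few busy threads, print the
# PID, then inspect the process from another terminal while it runs.
import os
import threading
import time

def busy_work():
    end = time.time() + 30
    while time.time() < end:
        pass               # burn CPU so the thread stays visibly alive

if __name__ == '__main__':
    threads = [threading.Thread(target=busy_work) for _ in range(4)]
    for t in threads:
        t.start()
    print 'pid =', os.getpid(), '-- inspect this process from another terminal'
    for t in threads:
        t.join()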

Look, if you want to prove me wrong then prove me wrong, but do it with code I can run in my own Ruby, right here on my own computer(s), or don’t bother to tell me how wrong I might be… On the other hand, if it helps me get a job and earn some money, then I don’t mind letting certain select people tell me how wrong I am while I am taking their money; after I have cashed all the checks I will be right back here testing to see who’s right and who’s wrong.  So far Ruby keeps telling me how much it doesn’t use Real OS Threads…  and I am not the only person saying this, even though people keep telling me just how wrong I am.

BTW, I am NOT the one who is wasting his money on Ruby… When it’s my money I don’t throw it away trying to make-believe Ruby is what Ruby cannot be.  I know exactly what Ruby is and how it works.

So here’s the Proof !

The Ruby Code:

The Python Code:

The OS:

  • iMac Core i7 running OS/X 10.7 Lion
  • Ruby 1.8.7
  • Python 2.5.2 Stackless 3.1b3 060516 (python-2.52:61022, Feb 27 2008, 16:52:03)
  • [GCC 4.0.1 (Apple Computer, Inc. build 5341)]

The Proof

Conclusion

Some people hear the word “multi-threaded” and fail to look any deeper into the issue because they conclude, “…I can use threads, therefore it must be multi-threaded…”.  It is possible for Ruby 1.8.7 to be single-threaded from the OS Process perspective while being multi-threaded from Ruby’s own perspective, within the larger context of a single OS Thread.  Everything that happens in Ruby 1.8.7 that does not specifically use real OS threads will use simulated green threads, which are not real OS threads.
