Fundamental Rules of Performance
Eric Gunnerson had a good post on performance here. He is addressing the specific issue of the performance of generics, but he lays out two general principles:
(1) But for the majority of applications, this isn't the case. Performance is rarely dominated by small decisions, such as whether you use an ArrayList or a List<int> to store your data. It's usually dominated by algorithmic concerns - how you process data, how much redundant processing you're doing, etc.
(2) Note that I'm not saying that you should put performance off until the end - that rarely works, as time at the end is fairly precious. I think you should focus on macro performance rather than micro performance.
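To make the macro-versus-micro distinction concrete, here is a minimal sketch in Python (my choice of language for illustration, not Eric's). Both function names are made up for this example. Each function answers the same question, whether a list contains a duplicate, but the first does a lot of redundant comparison while the second remembers what it has already seen. No choice of container type would rescue the first version on large inputs, which is the kind of algorithmic concern Eric is talking about.

```python
def has_duplicates_quadratic(items):
    """Compare every pair of elements: O(n^2) comparisons in the worst case."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """Track the values seen so far in a set: one pass, O(n) expected time."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

Both functions return the same answers; only the amount of work differs, and that difference grows with the size of the input.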
I would add a third fundamental rule, one that I learned many eons ago from the classic book The Elements of Programming Style by Kernighan and Plauger and have verified from many years of experience since then:
(3) Make it right before you make it faster.
Let me give you a classic example of this rule, from a real-world project. Several years ago I had a project to develop an interactive development environment for an industrial robot. This included an environment for writing and managing robotic programs written in a proprietary language, as well as a compiler for that language and a real-time debugger that executed the program and communicated with the robot controller (e.g., for breakpoints). Of course, I wrote 99% of this application in VB6 :-)
Yes, I wrote a real compiler in VB6. Of course, I knew that compilers are very string-intensive, and that the performance of string manipulations wasn't exactly VB6's strongest feature. However, I also knew that developing the compiler and verifying its correctness was the real challenge here, and I figured I could rewrite the string-handling portions of it in C once it was all working correctly.
Of course, that day never came. The performance of the VB version was just fine and the client was thrilled to have a fully-functional version that much sooner. If I had needed to improve performance, rewriting that part in C would have been pretty trivial, because the compiler was already proven to be correct.
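One practical payoff of making it right first is that the correct version becomes the yardstick for any faster one. The sketch below is a hypothetical illustration in Python, not code from the robot project: the function names and the sample strings are invented. A deliberately plain tokenizer serves as the reference, and a faster candidate is accepted only if it produces the same tokens.

```python
def tokenize_simple(source):
    """Plain, easy-to-verify reference: build each token one character at a time."""
    tokens = []
    current = ""
    for ch in source:
        if ch.isspace():
            # Whitespace ends the current token, if there is one.
            if current:
                tokens.append(current)
                current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens


def tokenize_fast(source):
    """Candidate replacement that relies on the built-in whitespace split."""
    return source.split()


def agrees_with_reference(samples):
    """Return True only if the fast version matches the reference on every sample."""
    return all(tokenize_fast(s) == tokenize_simple(s) for s in samples)
```

Calling `agrees_with_reference(["MOVE arm TO 10", "", "  WAIT   5  "])` checks the candidate against the reference on a few made-up inputs. If the faster version ever disagrees, the slow but correct one is still there to fall back on.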