|
JavaOne Slides Slides for my 2009 JavaOne talks:
And of course shortly after JavaOne, I went to 2009 ECOOP in Italy for a week, and now I'm on a long deserved vacation. More blogging soon, on ECOOP at least, Category: Web/Tech | | TrackBack (0) TrackBackTrackBack URL for this entry: Listed below are links to weblogs that reference JavaOne Slides: CommentsHi Cliff I saw the slides for your benchmarking talk a few weeks back (I think from the JavaOne presentations site)--one thing that I missed was what you, as longtime JVM engineer, use to benchmark, e.g. what benchmarks you trust and which lead you to believe you have improved the code, are running as fast as/faster than another language, etc. Mostly you covered problems with existing, public benchmarks, but I'm sure you must have some way to verify your own expectations. Maybe you could blog about this at some point.
Regards Posted by: Patrick | Jul 21, 2009 9:36:56 AM STM is not 15 years old --- the first STM systems appeared in 2005 --- off by a factor of 3 :-). Posted by: Maurice Herlihy | Jul 21, 2009 6:03:30 PM What about this one from 1995? Nir Shavit and Dan Touitou. Software Transactional Memory. Proceedings of the 14th ACM Symposium on Principles of Distributed Computing, pp.204–213. August 1995. Cliff Posted by: Cliff Click | Jul 21, 2009 6:33:30 PM Ah, but the 1995 Shavit & Touitou STM was a design only; there was no implementation. It was a PODC (theory) paper, intended to show it was *possible* to implement TM entirely in software (which seems obvious now but wasn't then). The actual algorithm required predeclaring write sets, and had other issues that did not make it a promising candidate for implementation. AFAIK, the first actual STM implementation was DSTM in 2005. Even HTM received little or no attention until after 2000. I recently prepared a graph of citation counts for a talk, and turns out 2000 was the first year the 1993 TM paper broke through the 10 citation per year barrier. (It didn't break 100 per year until 2006.) So TM, both hardware and software, is much less mature than you would think from the standard citations. If we had written the 1993 paper in 2001, not much would have changed. Anyway, I am enjoying your slides and reports: carry on! Maurice Posted by: Maurice Herlihy | Jul 21, 2009 9:34:54 PM Good read. Could you elaborate on the following quotes? * Fixnums: "Setting final fields cost a memory fence" Thanks. Posted by: Nils | Jul 23, 2009 6:36:02 AM * Fixnums: "Setting final fields cost a memory fence" - * Lessons: "Language/bytecode mismatch – can't assume, e.g. Uber GC or Uber Escape Analysis, or subtle final-field semantics" It's a chicken-and-egg type problem. No important programs churn Integers, so nobody tries to optimize them. Do you dive in an try to make an 'egg' and hope a JVM 'chicken' will appear that Escape-Analysis's away the 90% cost? Cliff Posted by: Cliff Click | Jul 23, 2009 7:21:15 AM Just a few cents.. I am less aware of specific pitfalls that relate to Java, but tend to be more concerned with the more generic ones. Maybe some of this may help,,, * First, how many tasks are you running and is that number greater than or close to the number of cores in the system,, There was once a saying that said if a process was EQUALLY split into 2 threads and one thread went Infinitely faster, you would only halve your execution time.. Posted by: Mitchell Erblich | Aug 4, 2009 2:28:43 AM I'm not sure which slides you are referring too! :-) * First, how many tasks are you running and is that number greater than or close to the number of cores in the system,
Cliff Posted by: Cliff Click | Aug 4, 2009 8:48:09 AM Cliff- I just watched an online video of the "This Is Not Your Father's Von Neumann Machine" talk you gave at the Sun 2009 JVM Language Summit ( http://www.infoq.com/presentations/click-crash-course-modern-hardware ). Excellent talk. Thanks for sharing it. The slides appear to be the same as the JavaOne talk of the same name linked above, which is why I'm commenting here, even though I didn't see the JavaOne talk. I have a question about the "real chips reorder stuff" example you gave. The slides don't list which architecture(s) the example applies to, but in the video I think you referred to x86. I'm trying to reconcile this with my understanding of the x86 memory model, which I thought would prevent the kind of reordering you describe, at least under normal circumstances (i.e. using write-back memory and no fancy instructions). I went back to the "Intel 64 and IA-32 Architectures Software Developer's Manual Volume 3A" ( http://www.intel.com/Assets/PDF/manual/253668.pdf ) and found section 8.2.3.2, titled "Neither Loads Nor Stores Are Reordered with Like Operations" (page 323). This example, at least as I read it, is a simplified version of the one presented in your talk, but it reaches the opposite conclusion: that you are guaranteed to never get into a state where the second write done by CPU #0 is seen by CPU #1 but not the first. Am I misreading the Intel document, or does your example only apply to other chips with weaker consistency guarantees, like the ia64? Thanks! Posted by: Dave Clausen | Jan 15, 2010 4:34:06 PM X86 has a very strong hardware memory model, but it still allows younger loads to bypass older stores. You get other racey behaviors than what I demo'd; I stuck with an easy-to-understand example. That particular re-ordering in the slides is possible on IA64, Alpha, & Azul gear, plus a bunch of higher-end DSP-like things. Keeping the strong X86 memory model costs Intel something; if they dropped the strong ordering they totally could make a faster X86 - but people would then have to deal with the O-O-O'ness of it. Cliff Posted by: Cliff Click | Jan 15, 2010 6:52:16 PM Post a comment |


