Mac chat


chest

Check out this long letter, pieced together from several messages by this likeable gentleman who works at NASA, and taken from Ric Ford's site.

Especially interesting for Chest.

I find it very interesting for several reasons:

he works in a mixed environment, he pushes the G4s to their limits, and he is able to understand the advantages and disadvantages of both hardware platforms.

---

Craig Hunter . NASA

G4 Performance

I have been following the discussion of Rob Galbraith's benchmarks with much interest, as I have spent a good deal of time testing, optimizing, and benchmarking software for the G4 (OS X) and P4 (Linux).

The first thing to realize is that there are numerous benchmarks that show the P4 is faster, and there are numerous benchmarks that show the G4 is faster. What matters? Well, probably the benchmarks that apply to the kind of work you do. For people doing photo processing with the software Rob tested, his results are extremely relevant. But, someone working with a program optimized for AltiVec and dual processors might have a completely opposite experience.

Just to give an example of a benchmark that goes the other way, see this chart.

http://members.cox.net/craig.hunter/jet3d.jpg

(You're welcome to mirror this benchmark image, since my web site may not handle a lot of traffic). These real-world results come from the Jet3D computational fluid dynamics noise prediction software, which I developed for my doctoral thesis and currently use in my work at NASA. Jet3D is written in a combination of FORTRAN 77, FORTRAN 90, and C, and is optimized for AltiVec and dual processors on G4 hardware. When compiled on Linux using Intel's ifc compiler tools, Jet3D also becomes optimized for the P4 (using the various SIMD extensions available on the P4).

As you can see, the G4 does quite well here. A dual processor 1.25GHz G4 system is more than 3.5X faster than a single processor 2GHz P4 system. Though it's not shown on the chart, a single 1.25GHz G4 processor benchmarks at about 1589 MFLOPS, 1.9X faster than the P4. If you look at MFLOPS per MHz for a single processor, the G4 comes in at 1.27 MFLOPS/MHz, while the P4 comes in at 0.42 MFLOPS/MHz. If you want a good example of the MHz myth, look at the Cray, which comes in at 1.78 MFLOPS/MHz with only a 500MHz processor, beating both the G4 and P4.
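As a quick check of the arithmetic in the paragraph above, here is a minimal Python sketch that recomputes the MFLOPS/MHz figures from the numbers quoted in the letter. The absolute P4 and Cray MFLOPS values are not stated in the letter; they are derived here from the quoted ratios and clock speeds, so treat them as implied values rather than measurements. All variable names are made up for illustration.

```python
# Figures quoted in the letter (single-processor results).
G4_MFLOPS = 1589.0           # single 1.25GHz G4, AltiVec enabled
G4_CLOCK_MHZ = 1250.0
P4_CLOCK_MHZ = 2000.0
P4_MFLOPS_PER_MHZ = 0.42
CRAY_CLOCK_MHZ = 500.0
CRAY_MFLOPS_PER_MHZ = 1.78


def mflops_per_mhz(mflops: float, clock_mhz: float) -> float:
    """Throughput normalized by clock speed."""
    return mflops / clock_mhz


# Derived values: implied by the quoted ratios, not stated in the letter.
p4_mflops = P4_MFLOPS_PER_MHZ * P4_CLOCK_MHZ
cray_mflops = CRAY_MFLOPS_PER_MHZ * CRAY_CLOCK_MHZ

g4_per_mhz = mflops_per_mhz(G4_MFLOPS, G4_CLOCK_MHZ)
g4_vs_p4 = G4_MFLOPS / p4_mflops

print(f"G4:   {g4_per_mhz:.2f} MFLOPS/MHz")
print(f"P4:   about {p4_mflops:.0f} MFLOPS (implied)")
print(f"Cray: about {cray_mflops:.0f} MFLOPS (implied)")
print(f"Single G4 vs. P4: {g4_vs_p4:.1f}X")
```

The recomputed values agree with the letter: roughly 1.27 MFLOPS/MHz for the G4 and a single-processor lead of about 1.9X over the P4.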

Without AltiVec, the Jet3D benchmark would be about 794 MFLOPS on the dual-1.25GHz G4, which erases the performance lead over the P4. And then, using only a single processor, the 1.25GHz G4 benchmarks at about 418 MFLOPS, which is about half as fast as the P4. And all of a sudden, the G4 doesn't look very compelling. For the Jet3D benchmark, AltiVec and dual processors are key (AltiVec more so than dual procs). This is true for most benchmarks I have looked at; thus numerically intensive applications that can't use AltiVec and/or dual processors are likely to suffer on the G4.
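The claim that AltiVec matters more than the second processor can be checked from the quoted figures. This is a rough sketch, assuming the 418 MFLOPS figure is the single-processor result without AltiVec (which is how the paragraph reads) and reusing the 1589 MFLOPS single-processor AltiVec figure quoted earlier in the letter. The variable names are illustrative only.

```python
# Figures quoted in the letter for the 1.25GHz G4 (MFLOPS).
SINGLE_WITH_ALTIVEC = 1589.0
SINGLE_NO_ALTIVEC = 418.0    # assumed to be the non-AltiVec, single-CPU result
DUAL_NO_ALTIVEC = 794.0

# Gain from AltiVec on one processor.
altivec_gain = SINGLE_WITH_ALTIVEC / SINGLE_NO_ALTIVEC

# Gain from adding the second processor, without AltiVec.
dual_gain = DUAL_NO_ALTIVEC / SINGLE_NO_ALTIVEC

print(f"AltiVec gain (single CPU):      {altivec_gain:.1f}X")
print(f"Second-CPU gain (no AltiVec):   {dual_gain:.1f}X")
```

On these numbers AltiVec is worth about 3.8X, while the second processor is worth about 1.9X, which is consistent with "AltiVec more so than dual procs".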

In the case of Jet3D, it was easy to optimize for AltiVec. I was able to hand-vectorize about 10 lines of code within the guts of the FORTRAN algorithm and convert the computations to C for easy access to AltiVec hardware instructions. It had a huge effect for not a lot of work. For other more complicated cases, it may be possible to use the VAST compiler tools to automatically vectorize and tie in with AltiVec (VAST has parallel tools also). But in some cases, vectorization is not possible or feasible. In those instances, you're stuck with the processor's scalar performance, and the P4 generally has better scalar performance than the G4 in my experience. One final note: these are my personal views, and do not represent the views of NASA Langley Research Center, NASA, or the United States Government, nor do they constitute an endorsement by NASA Langley Research Center, NASA, or the United States Government.

I test the machines that are common in my world of science/engineering, and right now, that means P4/Linux systems and G4/OSX systems. Two years ago, I would also have included various legacy UNIX workstation systems in there, and then you might have 5-6 different operating systems being compared. It's just not possible to do academic benchmarks here; I work in a mixed environment.

I have done a good deal of benchmarking using Yellow Dog Linux and Black Lab Linux on G4 hardware in the past. Linux is certainly much leaner than OS X, and I currently feel that Linux is preferable for clustering applications. But the performance differences between Linux and OS X on G4 hardware are small compared to the performance differences between G4 hardware and P4 hardware. In cases where the P4 is significantly faster, using OS X or Linux on the G4 won't make much of a difference. Similarly, in cases where the G4 is the compelling choice for speed, it's less of an issue which OS is used. Many benchmarks and tests I run are maxing out various parts of the hardware architecture (bus, memory, etc...) and there's not a whole lot that the OS can do to help or hurt that. Both Linux and OS X are able to get out of the way when running dedicated computations in single user mode (no GUI) and that's what counts.

Actually, Jet3D is about 99% double precision. I was able to reformulate a "key" vector algorithm into single precision to take advantage of AltiVec by properly non-dimensionalizing the computation and using some common sense. The rest of the code remains double precision as required (plus, there would be little to gain from converting the rest of the code to use AltiVec anyhow, never mind that it's impractical).
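The letter does not show the reformulated algorithm, so the following is only an illustration of one reason non-dimensionalizing can make single precision usable: dividing by a reference quantity keeps intermediate values near 1, well inside the single-precision range. The values, the reference scale, and the helper `to_f32` are all invented for this sketch; single precision is emulated with Python's `struct` module.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float to IEEE single precision (inf on overflow)."""
    try:
        return struct.unpack("f", struct.pack("f", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


# Hypothetical dimensional quantity and reference scale (not from the letter).
x_dim = 3.0e20
x_ref = 1.0e20

# Squaring the dimensional value directly overflows single precision,
# whose largest finite value is about 3.4e38.
direct_sq = to_f32(to_f32(x_dim) * to_f32(x_dim))

# Non-dimensionalize first, do the arithmetic in single precision,
# then restore the units in double precision.
x_nd = to_f32(x_dim / x_ref)
scaled_sq = to_f32(x_nd * x_nd)
restored = scaled_sq * x_ref * x_ref

print("direct single-precision square:", direct_sq)
print("non-dimensional square:        ", scaled_sq)
print("restored in double precision:  ", restored)
```

The direct computation overflows to infinity, while the scaled one stays at an ordinary value and can be converted back afterwards. Whether a real code tolerates single precision still depends on its accuracy requirements, as the letter notes.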

A lot of us in science and engineering use codes that were written in double precision because it's required, but an equal number of codes use double precision and don't really need it (or only need it in certain parts of the code). I believe that with a closer look at the codes and/or some careful reformulation of algorithms, it would be possible to use single precision more often and thus open up more potential applications of AltiVec. BUT, and this is always a big BUT, why go to all this trouble perfecting the code for G4/AltiVec if your code already runs fast on P4/Linux systems without any fuss? Sometimes it's just not practical. I got lucky with Jet3D; vectorization and optimization for AltiVec were relatively painless. Out of about 5000 lines of code, only 10 "key" lines needed to be tweaked. There may be more cases like this out there.

This discussion leaves out a critical point: price/performance!

For applications that can take advantage of AltiVec, the G4 becomes extremely cost effective, especially when you consider the great desktop UNIX workstation role that OS X can serve (a single G4 OS X box replaced both an SGI workstation and an older G3 Mac on my desk).

Even in clustering, where custom-built P4/Linux systems are the hot commodity right now, the G4 can be competitive with AltiVec (but I think its strong point is on the desktop).

One thing to keep in mind about G4 vs. P4 comparisons for clustering is power and heat. If you have to spend $10-15K to equip your office with industrial-strength AC to handle the heat put out by P4 systems, the price/performance issue takes a serious turn. So there are lots of things to consider, and performance is only one small part. As a result, I use both Macs and PC/Linux systems in my daily work -- whatever is best and most cost-effective for the task at hand.

Olaf,

<u>As for the RAM, from the description of your experience what comes to mind is a RAM defect. If the computer starts up, then that type of RAM works. If it didn't, the computer wouldn't even start.</u>

It happened to me that, with 256MB of PC100 on a G4/450, repeated use of the clone stamp in Photoshop produced horrible runs of three or four pixels, each time in a different primary color, which disappeared only when I stamped over them in turn. I was convinced it was a Photoshop bug or a motherboard problem; then I added another 256MB module and the defect miraculously disappeared. Make of that what you will.

<u>Maselli> The registry doesn't exist for the normal user. If you consider yourself an expert user and look into it, fine, but don't expect it to be simple: I have yet to look at the Mac OS X configuration files (I'm waiting for my friend to get 10.2), but I don't think they are very clear.</u>

I'm not at all an expert Windows user but, generally, when I set my mind on doing something I do it, and I have set my mind on becoming one.

<u>Same goes for the registry: most of the options are accessible from the Control Panel.</u>

If only...

<u>I have yet to look at the Mac OS X configuration files (I'm waiting for my friend to get 10.2), but I don't think they are very clear.</u>

I know someone who told me he regularly recompiles the OS X kernel; I asked him whether the Mac kept working and he said yes (provided you respect a couple of 'things').

As you can see, it's all subjective.

Gianni

Max: nice elaboration on the "theme" of the female body. But if you had put y in place of f (the initial of yoni, that is, the equivalent of f... in the ancient Sanskrit of the Kamasutra/Kamashastra), the whole thing would probably have come out even better :D

Oohh... and so we finally also know what the Grande Veglio's female "type" is. Goodgoodgood ;D
