Is it possible to use the GPU to speed up map compilation?

Created 5 years ago2014-01-13 22:20:58 UTC by ninja defuse ninja defuse

Posted 5 years ago2014-01-13 22:21:38 UTC Post #317495
Is it possible to use the GPU's power to make map compilation faster?

I've been wondering if it is possible. Maybe we need a tool or something?
Posted 5 years ago2014-01-13 22:30:39 UTC Post #317496
I've been asking this question for ages. Yes, it's possible, but no one's done it :(
We could probably reduce compile times to a few seconds.
rufee rufeeSledge fanboy
Posted 5 years ago2014-01-13 23:47:31 UTC Post #317497
Shouldn't there be an API or something that emulates CPU computing on GPUs by now?
Striker StrikerI seriously doubt myself
Posted 5 years ago2014-01-14 00:32:33 UTC Post #317498
It's not that simple, but there are a number of standard programming languages and APIs that allow relatively easy use of GPU computation. It'd be a huge amount of work though, and you would need to know a shitload about the compilation process to be able to do it. I don't know enough about the compilation process to even know if it can be sped up via GPU programming (probably yes, but I couldn't say for sure).
Penguinboy PenguinboyHaha, I died again!
Posted 5 years ago2014-01-14 00:41:28 UTC Post #317499
Wow this would be amazing, do want!

I just got a processor upgrade (i7-3770, 3.4GHz), so I'm really anxious to test compile times against my old 2.26GHz Core 2 Duo. But man, are you saying the compile would be instantaneous?

It would be fun to try a lot of crazy things, things you would never ordinarily have the patience to wait out a 12-hour compile for =P
Captain Terror Captain Terrorwhen a man loves a woman
Posted 5 years ago2014-01-14 01:59:36 UTC Post #317502
I can tell you, Captain Terror, that the current HL1 open-space map I'm working on compiles in about 20 minutes on my desktop E6500 and about 3 minutes on my i7-3610QM laptop. The i7-3770 is probably way faster than that. It's because VHLT is optimized for multi-core, I guess.
Striker StrikerI seriously doubt myself
Posted 5 years ago2014-01-14 09:08:29 UTC Post #317506
Posted 5 years ago2014-01-14 10:17:37 UTC Post #317509
That post is ancient, ninja. OpenCL is generally the more common GPU compute language because it is more portable and can even fall back to the CPU when a GPU is not available. CUDA still exists but isn't very common because it only works on Nvidia graphics cards.
Penguinboy PenguinboyHaha, I died again!
Posted 5 years ago2014-01-14 10:37:59 UTC Post #317510
If the compile process can be parallelized as much as possible, then the GPU would have an advantage. I don't know much about the compile process either, but I believe that at least in RAD you can start lighting calculations at different points, thus having more tasks running (advantage: GPU), and meet at the final point once all the work is done.
The problem with having the GPU do all the work is that everything has to be uploaded into its memory before you run the task, and it stays there while it's compiling, so with old GPUs and huge maps you could run into problems where you run out of GPU RAM.
I'd love it if someone made a working implementation. I'd also love to make this myself if I had the knowledge of how to :)
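As a rough sketch of rufee's idea of starting lighting calculations at different points and meeting at the final point once all work is done, here is what the fan-out/join shape could look like with plain CPU threads. Everything here is hypothetical: `Patch`, `lightAllPatches`, and the falloff formula are made-up stand-ins, not code from any real RAD tool.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for one lighting sample; real RAD patches
// carry far more state than this.
struct Patch { float distToLight; float brightness; };

// Toy falloff: each patch can be lit independently of every other
// patch, which is exactly what makes this work parallel-friendly.
static void lightRange(std::vector<Patch>& patches, size_t begin, size_t end) {
    for (size_t i = begin; i < end; ++i)
        patches[i].brightness = 1.0f / (1.0f + patches[i].distToLight);
}

// Split the patch list into roughly equal chunks, one per worker,
// then "meet at the final point" by joining every thread.
void lightAllPatches(std::vector<Patch>& patches, unsigned workers) {
    std::vector<std::thread> pool;
    size_t chunk = (patches.size() + workers - 1) / workers;
    for (unsigned w = 0; w < workers; ++w) {
        size_t begin = w * chunk;
        size_t end = std::min(patches.size(), begin + chunk);
        if (begin < end)
            pool.emplace_back(lightRange, std::ref(patches), begin, end);
    }
    for (auto& t : pool) t.join();
}
```

On a GPU the same shape applies, except "one worker" becomes thousands of threads, and (as noted above) all the patch data has to fit in GPU memory first.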
rufee rufeeSledge fanboy
Posted 5 years ago2014-01-14 12:37:23 UTC Post #317511
There must be some kind of way of running an application inside a virtual computing system that emulates a CPU using the GPU, just like installing OSes in a virtual machine. I wonder if such software exists?
Striker StrikerI seriously doubt myself
Posted 5 years ago2014-01-14 13:14:05 UTC Post #317512
Is it so damn hard to wait a few minutes? You can make some toast and stuff.
Posted 5 years ago2014-01-14 13:15:27 UTC Post #317513
It doesn't work like that. Code written for a (serial) CPU would almost always run slower, if at all, on a GPU, even if perfect emulation were available (it isn't, AFAIK). The code needs to be written specifically for parallel execution. Multicore CPUs are starting to get people thinking in that direction, but you can't just summon "GPU magic" and get everything instantly working.

From a quick look online, experimental libraries like C++ AMP appear to be simplifying things and making GPU compute available from a standard C++ application, but you need to use the data structures and language macros defined by the library, it cannot just be bolted onto code and be expected to work out of the box.
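A concrete way to see why serial code can't just be "emulated" onto a GPU is a loop where every iteration depends on the one before it. This toy running total (a prefix sum; not code from any compile tool) is inherently serial as written. A GPU version needs the algorithm itself redesigned into multiple passes; no emulator can find that parallelism for you.

```cpp
#include <cstddef>
#include <vector>

// A running total: iteration i needs the result of iteration i-1,
// so the loop cannot be split across threads as-is. GPU scan
// algorithms restructure this into log2(N) data-parallel passes.
std::vector<int> prefixSum(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    int running = 0;
    for (size_t i = 0; i < in.size(); ++i) {
        running += in[i];   // depends on every earlier iteration
        out[i] = running;
    }
    return out;
}
```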
Penguinboy PenguinboyHaha, I died again!
Posted 5 years ago2014-01-14 17:36:15 UTC Post #317515
Penguinboy is right. CPUs and GPUs are fundamentally different processors.

CPUs are about doing one set of instructions on one set of data really, really fast. GPUs are about doing one set of instructions on multiple sets of data in parallel at a slower rate, but with higher throughput, so more work gets done. They emphasize bandwidth over latency.

GPUs outperform CPUs in raw computation with well-designed GPU algorithms. GPUs get slow when you start introducing lots of branching into the code, which is when you run different bits of code depending on data conditions: "if this condition is true, run that code, otherwise run that other code."

When a CPU branches, it just follows the code path and it's no big deal; it's only doing one thing at a time anyway. When a GPU branches, the core running the GPU code has to run the code for one branch on a subset of the threads that it is running in parallel. This means 12 of 1024 threads might be doing work while the others sit and wait. Then it has to run the code for the other branch, where the remaining 1012 threads do work while those 12 wait. When one set of data causes a branch, the other set of data that doesn't follow the branch has to sit and do nothing. In the worst case, every individual thread makes a different branching decision, which is considerably slower than a CPU. If you design an algorithm so that it doesn't need branching, you can keep every thread occupied and doing work.
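As a small illustration of keeping every thread on the same instructions (this is a classic bit trick, not something the compile tools actually do), a maximum can be computed with no branch at all, so no thread ever has to wait out a branch the others didn't take:

```cpp
// Branch-free max: every thread executes the same instructions
// regardless of the data. -(a > b) is all-ones when a > b, so the
// mask selects a via XOR; otherwise the mask is zero and b survives.
int branchlessMax(int a, int b) {
    return b ^ ((a ^ b) & -(a > b));
}
```

In practice GPU compilers often do this kind of predication for you on small branches; the real trouble starts with large divergent code paths.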

A good example of a parallel operation is when you want to find the largest number in a set of numbers.

0 4 28 49 8 14 94 3

A CPU will just run through these numbers sequentially and find the largest number. A GPU can do what is called a parallel reduction.

0 4 28 49 8 14 94 3

First we break the numbers up into groups of 2. A single GPU thread will decide which of the numbers is larger.

(0 < 4) | (28 < 49) | (8 < 14) | (94 > 3)
4 49 14 94

Here we have 4 GPU threads which operate in parallel at the same time. This is equivalent to doing "one" operation instead of 4 separate ones like on the CPU. Then you keep doing it until you have your one number.

(4 < 49) | (14 < 94)
49 94

(49 < 94)
94

And there's the answer. This effectively reduces us from doing X sequential operations to log2(X) parallel steps. There are more sophisticated ways to do this, but it gives you an idea of the strength of a GPU.
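The rounds above can be simulated in ordinary serial C++ to check the logic. Each pass of the loop below corresponds to one parallel step on a GPU, where every pair would be compared by its own thread at the same time:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Serial simulation of the parallel reduction described above: each
// round pairs elements up and keeps the larger of each pair. With a
// thread per pair, a round costs roughly one step, so the whole
// reduction takes about log2(N) steps instead of N.
int reduceMax(std::vector<int> v) {
    while (v.size() > 1) {
        std::vector<int> next;
        for (size_t i = 0; i + 1 < v.size(); i += 2)
            next.push_back(std::max(v[i], v[i + 1]));
        if (v.size() % 2 == 1)        // odd leftover passes through
            next.push_back(v.back());
        v.swap(next);
    }
    return v[0];
}
```

Running it on the thread's example, {0, 4, 28, 49, 8, 14, 94, 3} collapses to 94 in three rounds, matching the walkthrough above.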

Lighting computations can definitely benefit from GPU computation. I can understand why Valve might not want to do that though. They've probably made too many CPU-specific assumptions. It would most likely be faster to rewrite it from scratch than to "fix" it to work efficiently on a GPU.
TheGrimReafer TheGrimReaferADMININATOR
Posted 5 years ago2014-01-16 03:22:36 UTC Post #317531
Wow.
I feel very lucky that there's so many smart people on this site.
Thanks for that tidbit grim!
Tetsu0 Tetsu0Original Cowboy
Posted 5 years ago2014-01-18 10:03:23 UTC Post #317542
User posted image
Striker StrikerI seriously doubt myself