Add streams results metric data. #57
base: main
Conversation
This relates to RPOPC-758.
malucius-rh left a comment:
LGTM
frival left a comment:
I'm inclined to approve, but I have a couple of questions first, because some things aren't made readily clear (and GitHub's mangling of the formatting doesn't help). Take a deep breath, folks.
First, this looks like it results in one archive per socket count, with all optimization levels, array sizes, and iterations for each array size in the same archive, is that correct? That means for our "normal" two-socket systems we'd get two archives per run. It looks like we're recording separate metrics for each of Copy/Add/Scale/Triad, which makes me happy. Do we want to have one archive per socket count (or are my old eyes just giving me a challenge with the alignment in the diff output)? I'm not asking if that's the easiest way to do it, but whether it's ideal when we have to use this for analysis.
Second, I don't see a log of the run; I see some logged output that GitHub has mangled beyond belief, but that's not the same. Ideally we'd be showing the full run output (e.g. from a bash -x ./streams_run). We got so aggressive about swallowing output, because of what it did to the logs and screen of a multi-system run, that I think we're now missing very valuable output that could make things better. I'd like that so we can ensure there are no other weird errors or warnings that we're silently ignoring unintentionally.
Test output is located at https://github.com/user-attachments/files/24529237/streams_O2_virtual-guest.txt (as in the submit). We have to make a decision: we can put the output directly in the PR or in an attachment. My take: small output in the PR, large amounts in an attachment.
pcp archive name
That's not the output of the full wrapper, that's only part of the run.
I agree it will make the pmrep output harder to read, but doing it this way also makes comparing across sizes more complicated, as we have to merge archives. This way is easier for parsing the pmrep output; the other is easier for comparing different sizes, etc. Both have upsides and downsides; I just want to be sure we're intentional about what we're choosing here vs. taking the first available option.

This also is very confusing in the log. It's probably expected, but it's darned confusing:

Finally, because I'm being pedantic this week, ideally for a test like this we'd also have logs from a multi-NUMA-node system and not just a 4 vCPU instance. We're requiring those be run according to the new rules, so including the logs from them would be helpful.
If we ever get support for annotations in PCP, where we can inject things like "here beginneth 16384k O2" into the archive, the One Big Archive method will become easier to handle. Until then, for cases like this, where we effectively have sizes × optlevels separate tests that happen to be grouped for runtime convenience, we may want to consider (even though I really don't like the approach) logging the "config knobs" as metrics before the individual runs, treating them as ugly annotations.
Description
Streams PCP logging is broken; we are not setting anything properly. This fixes that issue.
Before/After Comparison
Before:
Seeing:
Logging results iteration_1 stream.36608k_iter_
Unexpected metric logged. Check for a Typo
After:
Above message not seen.
[root@ip-170-0-17-77 pcp_2026.01.09-12.59.58]# pmrep -p -a streams_size_stream.36608k_opt_level_2_threads_4_sockets_1.0 openmetrics.workload
12:49:56 1.000 1.000 0.000 NaN NaN NaN 21504.40 24648.500 25414.1 25307.400
12:49:57 1.000 1.000 0.000 NaN NaN NaN 21504.40 24648.500 25414.1 25307.400
12:49:58 1.000 1.000 0.000 NaN NaN NaN 21504.40 24648.500 25414.1 25307.400
12:49:59 0.000 0.000 0.000 NaN NaN NaN NaN NaN NaN NaN
Fix details:
Added an openmetrics file for each of Add, Copy, Scale, and Triad.
Moved the array size and iteration loops so all iterations happen for an array size at once.
The pcp file now contains the information of interest: array size, optimization level, number of threads, and number of sockets. We could probably push this all into one file, but it made more sense to me to have one file for each of those.
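For context, the PCP openmetrics PMDA consumes Prometheus-style text exposition files. Below is a minimal sketch of what one such per-metric file might contain; the metric name, labels, file path, and value are illustrative assumptions, not the wrapper's exact output:

```shell
# Write a hypothetical Prometheus/openmetrics-style exposition file of the
# kind the openmetrics PMDA can scrape. All names and values below are
# illustrative only.
cat > /tmp/streams_copy.txt <<'EOF'
# HELP copy_mbps STREAM Copy bandwidth in MB/s
# TYPE copy_mbps gauge
copy_mbps{array_size="36608k",opt_level="2",threads="4",sockets="1"} 21504.40
EOF

# Sanity check: exactly one sample line for the metric.
grep -c '^copy_mbps' /tmp/streams_copy.txt
```

Encoding the run parameters as labels (rather than baking them into the metric name) is one way to keep a single file per STREAM kernel while still distinguishing sizes and optimization levels at analysis time.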
Fix the following line (separate commit):

```shell
info=`grep "${search_for}" ${file}* | tr -s " " | sed "s/ /:/g" | cut -d: -f 4`
```

to be:

```shell
info=`grep -h "${search_for}:" ${file}* | tr -s " " | sed "s/ /:/g" | tr -s ':' | cut -d: -f 2`
```

The -h eliminates the file name being the first field when we are working with multiple files.
The tr -s ':' ensures that we have only a single ':', not several in a row.
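As a quick sanity check of the corrected pipeline, here is a minimal sketch run against two hypothetical result files; the file names and the "Tuned" key are assumptions for illustration, not the wrapper's real files:

```shell
# Two hypothetical "key: value" result files, mimicking the multi-file
# case where grep prefixes each match with the file name and breaks the
# old "cut -d: -f 4" field index.
tmpdir=$(mktemp -d)
printf 'Tuned: virtual-guest\n' > "$tmpdir/res_a.txt"
printf 'Tuned: balanced\n' > "$tmpdir/res_b.txt"

search_for="Tuned"
file="$tmpdir/res"

# Corrected pipeline: -h suppresses the file-name prefix grep adds for
# multiple files; tr -s ':' squeezes the "::" produced when sed turns the
# space after "key:" into another ':'. Field 2 is then always the value.
info=$(grep -h "${search_for}:" ${file}* | tr -s " " | sed "s/ /:/g" | tr -s ':' | cut -d: -f 2)
echo "$info"
rm -rf "$tmpdir"
```

With the two sample files above, this prints "virtual-guest" and "balanced", one per line, with no file-name field to throw off the cut index.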
Clerical Stuff
This closes #56
Relates to JIRA: RPOPC-758
Test results
Command executed
/home/ec2-user/workloads/streams-wrapper-2.1/streams/streams_run --run_user ec2-user --home_parent /home --iterations 1 --tuned_setting tuned_none_sys_file_ --host_config "m5.xlarge" --sysname "m5.xlarge" --sys_type aws --iterations 5 - --use_pcp
csv file
Test general meta start
Test: streams
Results version: 1.0
Host: m5.xlarge
Sys environ: aws
Tuned: virtual-guest
OS: 5.14.0-611.5.1.el9_7.x86_64
Numa nodes: 1
CPU family: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
Number cpus: 4
Memory: 15899880kB
Test general meta end
Test meta data start
Optimization level: O2
kernel_rev --meta_output numa_nodes
number_cpus
Core(s)_per_socket
Model_name
streams_version_# 5.10
Test meta data end
1 Socket
Array sizes:36608k:73216k:146432k:292864k
Copy:21504:21629:21560:22514
Scale:24733:24831:24855:24882
Add:25572:25603:25648:25671
Triad:25444:25540:25570:25597
Test meta data start
Optimization level: O3
kernel_rev --meta_output numa_nodes
number_cpus
Core(s)_per_socket
Model_name
streams_version_# 5.10
Test meta data end
1 Socket
Array sizes:36608k:73216k:146432k:292864k
Copy:21704:21587:21553:22478
Scale:24928:24933:24948:24914
Add:25742:25502:25823:25426
Triad:25614:25470:25735:25381
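The colon-separated rows above are easy to split back into per-size values with standard tools; a minimal sketch (the row and size list are copied from the O2 table above, while the loop itself and the MB/s label are illustrative, not part of the wrapper):

```shell
# Expand a "Copy:21504:21629:21560:22514" csv row into one line per array
# size, pairing each bandwidth with its size from the "Array sizes" row.
row="Copy:21504:21629:21560:22514"
sizes="36608k:73216k:146432k:292864k"

metric=$(echo "$row" | cut -d: -f1)
out=""
i=2
for size in $(echo "$sizes" | tr ':' ' '); do
    val=$(echo "$row" | cut -d: -f "$i")
    out="${out}${metric} ${size} ${val} MB/s
"
    i=$((i + 1))
done
printf '%s' "$out"
```

This prints one "Copy <size> <bandwidth> MB/s" line per array size, which makes the csv rows directly comparable with the per-archive pmrep output.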
partial pcp output
pmrep -p -a streams_size_stream.36608k_opt_level_2_threads_4_sockets_1.0 openmetrics.workload
12:49:56 1.000 1.000 0.000 NaN NaN NaN 21504.40 24648.500 25414.1 25307.400
12:49:57 1.000 1.000 0.000 NaN NaN NaN 21504.40 24648.500 25414.1 25307.400
12:49:58 1.000 1.000 0.000 NaN NaN NaN 21504.40 24648.500 25414.1 25307.400
12:49:59 0.000 0.000 0.000 NaN NaN NaN NaN NaN NaN NaN
=========================
Streams run output.
streams_x_out.txt