
Some issues seen during import apache logging #137

@sgoeting


Hey,

I don't know if this is the right place, but in my eyes it is at the core of my observations.

Lately we've been doing a lot of log imports in Matomo. We use the import_logs.py tool available in [matomo-path]/misc/log-analytics.

Initially we used the following command:
./misc/log-analytics/import_logs.py --url=[matomo host] --recorders=8 --replay-tracking [path-to-logfile]

As long as the log files were not too large, say less than 1 GB, the process went smoothly and we did not encounter any real problems. But the Apache log files keep getting bigger. As they grew, we noticed that our QT processing (a cron job scheduled every minute) had increasing difficulty working through the queues. The QT processing queues grew (a lot). Normally, QT processing gets a grip on the situation and the queues decrease and return to their normal state.

In the situation described above, QT processing still does that, but we also noticed some strange behavior. Once the import had finished, we expected the queues to decrease soon after QT processing kicked in, but they didn't. The queues continued to grow and only started to decrease after a few hours. Something else also caught our attention: with the QT monitor running alongside, we saw that the queues grew while the memory footprint became smaller; see the attached screenshot.

Image Pasted at 2020-7-7 14-27

But now our questions:

  1. What is the relationship of QT processing with the log import? We considered them separate, not related processes.
  2. How is it possible that the QT queues grow while at the same time the memory usage decreases?

Oh, for the record, we're using a slightly different import command now:
./misc/log-analytics/import_logs.py --url=[matomo host] --recorders=8 --replay-tracking --request-suffix=queuedtracking=0 [path-to-logfile]
Along with this command, we disabled the QT processing cron job and enabled per-request QT processing with a batch size of 1.
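
For anyone scripting the import, here is a minimal sketch of how the two invocations above could be assembled in Python. The helper name `build_import_command`, its parameters, and the example host and log path are hypothetical and only for illustration; the sketch uses only the flags already shown in this issue and does not run the import itself.

```python
import shlex


def build_import_command(matomo_url, logfile, recorders=8, add_queue_suffix=True):
    """Return the import_logs.py argument list.

    Hypothetical helper for illustration, not part of Matomo.
    """
    cmd = [
        "./misc/log-analytics/import_logs.py",
        f"--url={matomo_url}",
        f"--recorders={recorders}",
        "--replay-tracking",
    ]
    if add_queue_suffix:
        # Same suffix as in the second command quoted in this issue.
        cmd.append("--request-suffix=queuedtracking=0")
    # The log file path goes last, as in both commands above.
    cmd.append(logfile)
    return cmd


if __name__ == "__main__":
    # Placeholder host and path; replace with your own values.
    args = build_import_command("https://matomo.example.org", "/path/to/access.log")
    print(shlex.join(args))
```

Passing `add_queue_suffix=False` yields the first command from this issue, which makes it easy to compare both variants against the same log file.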

This command/configuration does not show the side effects of growing QT processing queues. Can you confirm this observation?
