Skip to content

Conversation

@shihab-dls
Copy link
Contributor

@shihab-dls shihab-dls commented Mar 20, 2025

When we put to a command PV, the put returns immediately, not waiting for the bound callback to complete. Adding the blocking=True flag sets the PVs PACT field high until the bound callback sets the PVs status to completed. This should allow an ophyd-async SignalX to be properly awaited.

I'm not sure if this behaviour should be set like this, however; is it always the case we'd want to wait for a callback when processing a command PV?

@codecov
Copy link

codecov bot commented Mar 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.63%. Comparing base (1cce8ea) to head (72aeb86).
Report is 18 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #141      +/-   ##
==========================================
+ Coverage   90.59%   90.63%   +0.03%     
==========================================
  Files          41       41              
  Lines        1935     1943       +8     
==========================================
+ Hits         1753     1761       +8     
  Misses        182      182              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@evvaaaa
Copy link
Contributor

evvaaaa commented Mar 21, 2025

I'm not sure if this behaviour should be set like this, however; is it always the case we'd want to wait for a callback when processing a command PV?

Hmmm this may require some thought on how to standardise.

PVA

In pva I set the command widget to true while the command is running, then false after it has finished.

class CommandPvHandler:
def __init__(self, command: CommandCallback):
self._command = command
self._task_in_progress = False
async def _run_command(self) -> dict:
self._task_in_progress = True
try:
await self._command()
except Exception as e:
alarm_states = p4p_alarm_states(
MAJOR_ALARM_SEVERITY, RECORD_ALARM_STATUS, str(e)
)
else:
alarm_states = p4p_alarm_states()
self._task_in_progress = False
return alarm_states
async def put(self, pv: SharedPV, op: ServerOperation):
value = op.value()
raw_value = value["value"]
if raw_value is True:
if self._task_in_progress:
raise RuntimeError(
"Received request to run command but it is already in progress. "
"Maybe the command should spawn an asyncio task?"
)
# Flip to true once command task starts
pv.post({"value": True, **p4p_timestamp_now(), **p4p_alarm_states()})
op.done()
alarm_states = await self._run_command()
pv.post({"value": False, **p4p_timestamp_now(), **alarm_states})
else:
raise RuntimeError("Commands should only take the value `True`.")

This means that awaiting a put to a SignalX will wait for the command to start runnning rather than for the command to finish.

I don't mind changing this to do blocking until the command is complete, but if so we should agree on this as a standard.

Unless I'm mistaken, the pva analogue of the above would be to

             # Flip to true after the command ends
             alarm_states = await self._run_command()
             pv.post({"value": True, **p4p_timestamp_now(), **alarm_states}) 
             op.done()

and remove the True/False logic, a command records value is always True. @coretl

@GDYendell
Copy link
Contributor

I could imagine you might want either... so it probably needs to be a flag on Command and @command?

@evvaaaa
Copy link
Contributor

evvaaaa commented Mar 21, 2025

I could imagine you might want either... so it probably needs to be a flag on Command and @command?

I was thinking the same, though we probably want it to be a kwarg of type:

class CommandMode(enum.Enum):
    #: Command value becomes `True` in the transport only
    #: once the command starts, becomes `False` after it 
    #: finishes.
    CONTINUE_AFTER_START = "CONTINUE_ONCE_STARTED"

    #: Command value becomes `True` in the transport only after
    #: the command has finished. Becomes `False` immediately after.
    CONTINUE_AFTER_FINISH = "CONTINUE_AFTER_FINISH"

@GDYendell
Copy link
Contributor

GDYendell commented Mar 21, 2025

I think CONTINUE_AFTER_FINISH should be high after (or just before) start, low after finish. Otherwise clients will timeout waiting for it to go high.

@evvaaaa
Copy link
Contributor

evvaaaa commented Mar 21, 2025

Yea that's much nicer.

class CommandMode(enum.Enum):
    #: Command value becomes `True` in the transport once
    #: the command starts, becomes `False` after it finishes.
    HIGH_AFTER_START = "HIGH_AFTER_START"

    #: Command value becomes `True` in the transport only after
    #: the command has finished. Becomes `False` immediately after.
    HIGH_AFTER_FINISH = "HIGH_AFTER_FINISH"

@GDYendell
Copy link
Contributor

@shihab-dls is there an issue to do with arming eiger that this would fix?

@shihab-dls
Copy link
Contributor Author

@shihab-dls is there an issue to do with arming eiger that this would fix?

This should unblock #551, where the Eiger's arm signal is now a SignalX. This is incompatible with current functionality of ophyd-async's set_and_wait_for_other_value, which we used to use to wait for the detectors State signal to flag that arming is complete. So blocking on the pvs put here would avert the need to use this function at all.

@GDYendell
Copy link
Contributor

@coretl @DominicOram do you have any thoughts on the behaviour we might want here?

@shihab-dls
Copy link
Contributor Author

I've just pushed some rough commits outlining the behaviour we're describing (to my understanding). @GDYendell and @evalott100 did I follow correctly? Happy to revert the changes if it's decided HIGH_AFTER_FINISH is all we need.

@coretl
Copy link
Contributor

coretl commented Mar 21, 2025

I think we always want blocking=True, as we can always to the areaDetector thing of caput -c in the background and wait until the readback is True to tell an operation has started, and still have something to wait on to tell when it finishes.

So the order should be:

  • Receive a caput/pvput
  • Start the task
  • Set the value to True
  • Wait for the task to complete
  • Set the value to False
  • pv.post/finish the on_update call

@GDYendell
Copy link
Contributor

So we don't need the option @shihab-dls, it should always do Action(blocking=True) and the pva version should be updated to match @coretl's description above.

@evvaaaa
Copy link
Contributor

evvaaaa commented Mar 21, 2025

So the order should be:

* Receive a `caput`/`pvput`

* Start the task

* Set the value to `True`

* Wait for the task to complete

* Set the value to `False`

* `pv.post`/finish the `on_update` call

This would be equivalent to what we have in the PVA transport right now, so long as we swap the done around to be after everything in the CommandPvHandler:

            # Flip to true once command task starts
            pv.post({"value": True, **p4p_timestamp_now(), **p4p_alarm_states()})
            alarm_states = await self._run_command()
            pv.post({"value": False, **p4p_timestamp_now(), **alarm_states})
            op.done()
        else:
            raise RuntimeError("Commands should only take the value `True`.")

This causes the behavior we expect for commands:

async def do_nothing():
    print("STARTED COMMAND")
    await asyncio.sleep(10)
    print("ENDED COMMAND")

class ControllerLongNames(Controller):
    command_short_name = Command(do_nothing)
  • Put times out:
[:(] pvput -w 5 MEH:CommandShortName true
Old : 2025-03-21 14:24:35.584  false 
Put timeout
  • Though the flag is set during the run:
[:)] pvget  -w 5 MEH:CommandShortName 
MEH:CommandShortName 2025-03-21 14:24:41.214  true 
  • ... and is unset after:
[:)] pvget  -w 5 MEH:CommandShortName 
MEH:CommandShortName 2025-03-21 14:24:41.214  false 

Suggestion

For equivalence to the p4p backend we change from using a builder.Action to a builder.BoolOut so that we have the flag instead of single value.

blocking=True and blocking=False on builder.Action doesn't actually seem to make a difference in whether the caput on the pv times out, so we'll need to find out how to delay the operation to after the command ends.

@GDYendell
Copy link
Contributor

blocking=True and blocking=False on builder.Action doesn't actually seem to make a difference in whether the caput on the pv times out, so we'll need to find out how to delay the operation to after the command ends.

I don't think this is what you found, is it @shihab-dls ?

@shihab-dls
Copy link
Contributor Author

shihab-dls commented Mar 24, 2025

blocking=True and blocking=False on builder.Action doesn't actually seem to make a difference in whether the caput on the pv times out, so we'll need to find out how to delay the operation to after the command ends.

I don't think this is what you found, is it @shihab-dls ?

I've been testing this out; it seems it is true that the record value is set immediately, which means that:

@command()
async def comm(self):
    print("Started command")
    await asyncio.sleep(5)
    print("Finished command")

with blocking=True, results in:

$ pvput -w 1 blah:Comm 1
Old : 2025-03-24 10:45:27.418  (0) 
New : 2025-03-24 10:48:52.454  (1) 

where the put does not timeout. Similarly, a pvget results in:

$ pvget -w 1 blah:Comm
blah:Comm 2025-03-24 10:50:21.889  (1)

where the new record value is immediately available.

However, the operation completion is only triggered after the callback completes, which mean monitors are updated after the callback:

$ pvput -w 1 blah:Comm 1
Old : 2025-03-24 11:03:32.414  (0) 
New : 2025-03-24 11:03:52.857  (1)
$ pvmonitor blah:Comm
blah:Comm 2025-03-24 11:03:32.414  (0) 

where the monitor does not update until 5 seconds later:

$ pvmonitor blah:Comm
blah:Comm 2025-03-24 11:03:32.414  (0) 
blah:Comm 2025-03-24 11:03:57.858  (1)

If you were to pvget the records PACT field instead of its value, you would see the required behaviour of:

  • Receive a caput/pvput
  • Start the task
  • Set the value to True
  • Wait for the task to complete
  • Set the value to False
  • pv.post/finish the on_update call

for example:

$ pvput -w 1 blah:Comm 1
Old : 2025-03-24 11:08:12.388  (0) 
New : 2025-03-24 11:08:28.699  (1) 
$ pvget blah:Comm.PACT
blah:Comm.PACT 2025-03-24 11:15:36.878  1 

followed by:

$ pvget blah:Comm.PACT
blah:Comm.PACT 2025-03-24 11:15:41.879  0 

which I believe is why @evvaaaa wants us using

the flag instead of single value.

It seems, however, that a put with callback waits for server processing to be complete, which means that it correctly waits for the PACT field to go low before returning, giving us our desired arm() behaviour on detectors.

TLDR

Setting blocking=True is not directly analogous to the changes in pva, because the records value field is updated immediately, rather than:

  • Set the value to True
  • Wait for the task to complete
  • Set the value to False

This means a pvput or pvget will return immediately.

However, the records PACT field does follow the order:

  • Set the value to True
  • Wait for the task to complete
  • Set the value to False

Which means that doing a put with callback will correctly wait for callback completion before returning.

@DominicOram
Copy link
Contributor

@coretl @DominicOram do you have any thoughts on the behaviour we might want here?

I don't have much to contribute to the details of this but I agree with @coretl that the callback should only return on the process being completed. Otherwise we have no reliable way to tell something has completed from the client side. If a client chooses not to wait for that callback then that's up to individual clients

@shihab-dls
Copy link
Contributor Author

These final commits should now have this PR doing the following:

  • Set op.done() after callback is complete for pva
  • Set blocking=True for ca

which should have our put callbacks, regardless of backend, return after a bound method has completed.

@coretl
Copy link
Contributor

coretl commented Mar 24, 2025

These final commits should now have this PR doing the following:

* Set `op.done()` after callback is complete for pva

* Set `blocking=True` for ca

which should have our put callbacks, regardless of backend, return after a bound method has completed.

Note that is for CA via the pythonSoftIOC transport, and PVA via the p4p transport. PVA via the pythonSoftIOC transport currently doesn't wait for completion due to epics-base/pvxs#105. This is not a blocker for FastCS (as we will use PVA via the p4p transport, so will probably turn it off in pythonSoftIOC via DiamondLightSource/pythonSoftIOC#185), but we will need this later for areaDetector IOCs when we go full PVA on beamlines.

@GDYendell GDYendell changed the title Add Blocking Flag to Command PVS Add Blocking Flag to Command PVs Mar 25, 2025
GDYendell
GDYendell previously approved these changes Mar 25, 2025
Copy link
Contributor

@GDYendell GDYendell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good now. I suggest we have a brief discussion in the stand up to make sure we are in agreement and then merge.

@coretl
Copy link
Contributor

coretl commented Mar 25, 2025

We should make sure that we match the behaviour of the IOCs, so if we pvput it doesn't wait for completion, but if we pvput -r 'record[block=true]' -w then it does as per epics-base/pvxs#105 (comment)

@shihab-dls
Copy link
Contributor Author

shihab-dls commented Mar 26, 2025

We should make sure that we match the behaviour of the IOCs, so if we pvput it doesn't wait for completion, but if we pvput -r 'record[block=true]' -w then it does as per epics-base/pvxs#105 (comment)

Looks like with blocking=True on the softIOC backend, pvput -r 'record[block=true]' -w does wait for the callback to complete.

$ time pvput -r "record[block=true]" -w 10 blah:Comm 1
Old : <undefined>              INVALID DRIVER UDF (0) 
New : 2025-03-26 12:29:02.427  (1) 

real	0m5.009s

where we timeout if -w 4 (i.e., process takes > 4s)

$ time pvput -r "record[block=true]" -w 4 blah:Comm 1
Old : 2025-03-26 12:32:44.686  (0) 
Put timeout

real	0m4.009s

but the value is changed when we check afterwards:

$ pvget blah:Comm
blah:Comm 2025-03-26 12:33:02.479  (1)

Blocking Behaviour Post Merge:

Command SoftIOC WAITS? P4P WAITS?
caput -c -w <TIMEOUT> <PREFIX>:<COMMAND_PV> <VALUE> YES N/a
caput -w <TIMEOUT> <PREFIX>:<COMMAND_PV> <VALUE> NO N/a
caput <PREFIX>:<COMMAND_PV> <VALUE> NO N/a
pvput -r 'record[block=true]' -w <TIMEOUT> <PREFIX>:<COMMAND_PV> <VALUE> YES YES
pvput -w <TIMEOUT> <PREFIX>:<COMMAND_PV> <VALUE> NO YES
pvput <PREFIX>:<COMMAND_PV> <VALUE> NO YES (with default timeout=5s)

Is this the behaviour that was agreed upon, @coretl and @GDYendell? Where P4P waits for callback completion regardless of additional flags (where -w is sort of required to ensure no timeout)?

@shihab-dls
Copy link
Contributor Author

shihab-dls commented Mar 28, 2025

After pythonSoftIOC #186, behaviour will be:

Command SoftIOC WAITS? P4P WAITS?
caput -c -w <TIMEOUT> <PREFIX>:<COMMAND_PV> <VALUE> YES N/a
caput -w <TIMEOUT> <PREFIX>:<COMMAND_PV> <VALUE> NO N/a
caput <PREFIX>:<COMMAND_PV> <VALUE> NO N/a
pvput -r 'record[block=true]' -w <TIMEOUT> <PREFIX>:<COMMAND_PV> <VALUE> N/a YES
pvput -w <TIMEOUT> <PREFIX>:<COMMAND_PV> <VALUE> N/a NO
pvput <PREFIX>:<COMMAND_PV> <VALUE> N/a NO

Where PVA is over P4P, CA is over softIOC.

Current PR behaviour:

Command SoftIOC WAITS? P4P WAITS?
caput -c -w <TIMEOUT> <PREFIX>:<COMMAND_PV> <VALUE> YES N/a
caput -w <TIMEOUT> <PREFIX>:<COMMAND_PV> <VALUE> NO N/a
caput <PREFIX>:<COMMAND_PV> <VALUE> NO N/a
pvput -r 'record[block=true]' -w <TIMEOUT> <PREFIX>:<COMMAND_PV> <VALUE> YES YES
pvput -w <TIMEOUT> <PREFIX>:<COMMAND_PV> <VALUE> NO NO
pvput <PREFIX>:<COMMAND_PV> <VALUE> NO NO

Which should better match epics-base/pvxs#105 (comment)

@shihab-dls
Copy link
Contributor Author

PR should be ready for re-review now.

Note:
We import from pvi._format.dls import DLSFormatter, which now fails linting with the new pyright release due to reportPrivateImportUsage. I've type ignored this for now, but it should be addressed in PVI at some point.

@shihab-dls shihab-dls requested review from GDYendell and coretl March 28, 2025 14:15
@shihab-dls shihab-dls requested a review from GDYendell March 31, 2025 09:24
GDYendell
GDYendell previously approved these changes Mar 31, 2025
@shihab-dls shihab-dls merged commit 2c546da into main Mar 31, 2025
18 checks passed
@shihab-dls shihab-dls deleted the block_command_records_during_callback branch March 31, 2025 12:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants