Long Activity Video Understanding using Functional Object-Oriented Network

Paper:
- arXiv (best version)
- IEEE Xplore

NOTES

This work uses two deep networks to jointly perform object detection and action recognition. The detected objects and recognized actions are then matched to the object and motion nodes of functional units in a functional object-oriented network (FOON).
Follow-up work by Ahmad explored more fine-grained object-state recognition for cooking tasks.
- That follow-up work builds on a state taxonomy for cooking-related images.
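
To make the matching step in the notes above more concrete, here is a minimal Python sketch of how detected object labels and a recognized action label might be compared against functional units. It is not the authors' method: the paper relies on deep networks, while this toy version assumes exact label matches. The `FunctionalUnit` class, its fields, the `match_units` helper, and the example labels are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionalUnit:
    """Toy stand-in for a FOON functional unit (hypothetical structure)."""
    inputs: frozenset   # labels of the input object nodes
    motion: str         # label of the motion node
    outputs: frozenset  # labels of the output object nodes


def match_units(detected_objects, recognized_motion, units):
    """Return units whose motion node equals the recognized action,
    ranked by the fraction of their input objects that were detected."""
    detected = set(detected_objects)
    candidates = [u for u in units if u.motion == recognized_motion]
    return sorted(
        candidates,
        # max(..., 1) guards against a unit with no input objects.
        key=lambda u: len(u.inputs & detected) / max(len(u.inputs), 1),
        reverse=True,
    )


# Hypothetical units and detections, for illustration only.
units = [
    FunctionalUnit(frozenset({"knife", "onion"}), "chop", frozenset({"chopped onion"})),
    FunctionalUnit(frozenset({"knife", "tomato"}), "chop", frozenset({"chopped tomato"})),
    FunctionalUnit(frozenset({"spoon", "bowl"}), "stir", frozenset({"bowl"})),
]

best = match_units(["knife", "onion", "cutting board"], "chop", units)[0]
print(sorted(best.outputs))
```

A real pipeline would presumably work with detection and recognition confidences rather than exact labels; the ranking by input coverage here is only one simple way to pick a candidate unit.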

Citation

Jelodar, A. B., Paulius, D., and Sun, Y. (2018) “Long Activity Video Understanding using Functional Object-Oriented Network”. In: IEEE Transactions on Multimedia, vol. 21(7), pp. 1813–1824.
