L. Eeckhout, Heterogeneity in Response to the Power Wall, IEEE Micro, vol.35, issue.4, pp.2-3, 2015.

M. A. Heroux, J. Dongarra, and P. Luszczek, HPCG Benchmark Technical Specification, 2013.

K. Li, IVY: a shared virtual memory system for parallel computing, Proc. 1988 Intl. Conf. on Parallel Processing, pp.94-101, 1988.

C. Amza, A. L. Cox, S. Dwarkadas, P. Keleher, H. Lu et al., TreadMarks: shared memory computing on networks of workstations, Computer, vol.29, issue.2, pp.18-28, 1996.

G. Antoniu, M. Bertier, L. Bougé, E. Caron, F. Desprez et al., GDS: An Architecture Proposal for a Grid Data-Sharing Service, Future Generation Grids, vol.6, pp.133-152
URL : https://hal.archives-ouvertes.fr/hal-01431487

J. A. Ross and D. A. Richie, Implementing OpenSHMEM for the Adapteva Epiphany RISC Array Processor, Procedia Computer Science, vol.80, pp.2353-2356, 2016.

J. Nelson, B. Holt, B. Myers, P. Briggs, L. Ceze et al., Latency-Tolerant Software Distributed Shared Memory, 2015 USENIX Annual Technical Conference (USENIX ATC 15), 2015.

S. Kaxiras, D. Klaftenegger, M. Norgren, A. Ros, and K. Sagonas, Turning Centralized Coherence and Distributed Critical-Section Execution on their Head, Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing - HPDC '15, pp.3-14, 2015.

L. Cudennec, Software-Distributed Shared Memory over Heterogeneous Micro-server Architecture, Euro-Par 2017: Parallel Processing Workshops, pp.366-377, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01679052

L. Cudennec, Merging the Publish-Subscribe Pattern with the Shared Memory Paradigm, Lecture Notes in Computer Science, pp.469-480, 2018.
URL : https://hal.archives-ouvertes.fr/cea-01896787

J. Stuecheli, W. J. Starke, J. D. Irish, L. B. Arimilli, D. Dreps et al., IBM POWER9 opens up a new era of acceleration enablement: OpenCAPI, IBM Journal of Research and Development, vol.62, issue.4/5, pp.8:1-8:8, 2018.

CCIX Consortium, Cache Coherent Interconnect for Accelerators (CCIX), 2019.

J. Cong, B. Liu, S. Neuendorffer, J. Noguera, K. Vissers et al., High-Level Synthesis for FPGAs: From Prototyping to Deployment, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol.30, issue.4, pp.473-491, 2011.

Y. S. Shao, S. L. Xi, V. Srinivasan, G. Wei, and D. Brooks, Co-designing accelerators and SoC interfaces using gem5-Aladdin, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp.1-12, 2016.

J. Cong, Z. Fang, M. Gill, and G. Reinman, PARADE: A cycle-accurate full-system simulation Platform for Accelerator-Rich Architectural Design and Exploration, 2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pp.380-387, 2015.

L. Feng, H. Liang, S. Sinha, and W. Zhang, HeteroSim: A Heterogeneous CPU-FPGA Simulator, IEEE Computer Architecture Letters, vol.16, issue.1, pp.38-41, 2017.

T. Liang, L. Feng, S. Sinha, and W. Zhang, PAAS: A system level simulator for heterogeneous computing architectures, 2017 27th International Conference on Field Programmable Logic and Applications (FPL), pp.1-8, 2017.

N. Binkert, B. Beckmann, G. Black, S. K. Reinhardt, A. Saidi et al., The gem5 simulator, ACM SIGARCH Computer Architecture News, vol.39, issue.2, pp.1-7, 2011.

R. Ubal, B. Jang, P. Mistry, D. Schaa, and D. Kaeli, Multi2Sim: a simulation framework for CPU-GPU computing, Proceedings of the 21st international conference on Parallel architectures and compilation techniques - PACT '12, pp.335-344, 2012.

W. Snyder, Verilator: the fast free verilog simulator, 2012.

C. M. Kirchsteiger, H. Schweitzer, C. Trummer, C. Steger, R. Weiss et al., A software performance simulation methodology for rapid system architecture exploration, 2008 15th IEEE International Conference on Electronics, Circuits and Systems, pp.494-497, 2008.

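M. K. Papamichael, J. C. Hoe, and O. Mutlu, FIST: A fast, lightweight, FPGA-friendly packet latency estimator for NoC modeling in full-system simulations, Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip - NOCS '11, pp.137-144, 2011.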

K. E. Murray, O. Petelin, S. Zhong, J. M. Wang, M. Eldafrawy et al., VTR 8: High-performance CAD and Customizable FPGA Architecture Modelling, ACM Transactions on Reconfigurable Technology and Systems, vol.13, issue.2, pp.1-55, 2020.

S. Seeley, V. Sankaranaryanan, Z. Deveau, P. Patros, and K. B. Kent, Simulation-based circuit-activity estimation for FPGAs containing hard blocks, Proceedings of the 28th International Symposium on Rapid System Prototyping: Shortening the Path from Specification to Prototype - RSP '17, pp.36-42, 2017.

A. Wicaksana, A. Charif, C. Andriamisaina, and N. Ventroux, Hybrid Prototyping Methodology for Rapid System Validation in HW/SW Co-Design, 2019 Conference on Design and Architectures for Signal and Image Processing (DASIP), pp.35-40, 2019.
URL : https://hal.archives-ouvertes.fr/cea-02494007

F. G. Gustavson, Two Fast Algorithms for Sparse Matrices: Multiplication and Permuted Transposition, ACM Transactions on Mathematical Software, vol.4, issue.3, pp.250-269, 1978.

T. A. Davis and Y. Hu, The University of Florida sparse matrix collection, ACM Transactions on Mathematical Software, vol.38, issue.1, pp.1-25, 2011.