Bug on xf::resize after upgrading from 2018.3 to 2019.1 #59

Open
3togo opened this issue Jun 21, 2019 · 9 comments

Comments

@3togo

3togo commented Jun 21, 2019

prj_xf_dummy_2019.1.zip
The following call synthesizes with xfOpenCV 2018.3 but fails with 2019.1. Why?
What does "Call parameter type does not match function signature!" mean?

xf::resize <INTERPOLATION, XF_8UC1, wk_rows, wk_cols, dst_rows, dst_cols, XF_NPPC1, MAXDOWNSCALE> (wk_disp, disp);

Call parameter type does not match function signature!
  %wk_disp.data.V = alloca [38400 x i8], align 1
 [307200 x i8]*  call fastcc void @"xf::resize<0, 0, 160, 240, 480, 640, 1, 20>"(i32* %wk_disp.rows, i32* %wk_disp.cols, [38400 x i8]* %wk_disp.data.V, i32* %disp.rows, i32* %disp.cols, [307200 x i8]* %disp.data.V), !dbg !31033
Broken module found, compilation aborted!
Stack dump:
0.	Running pass 'Function Pass Manager' on module '/home/eli/git/bjp/sfsfsfsfsf/ips/xf_test/prj_xf_test_2019.1/solution1/.autopilot/db/a.g.1.bc'.
1.	Running pass 'Module Verifier' on function '@xf_filter_simple'
/tools/Xilinx/Vivado/2019.1/bin/rdiArgs.sh: line 267:  8517 Aborted                 (core dumped) "$RDI_PROG" "$@"

@3togo
Author

3togo commented Jun 22, 2019

The error may be because:

A component cannot include virtual functions, function pointers, or bit fields.

@3togo
Author

3togo commented Jun 22, 2019

Below is the output dump:

INFO: [XFORM 203-602] Inlining function 'xf::Mat<0, 480, 640, 1>::write' into 'resizeNNBilinear<0, 160, 480, 1, 480, 640, 1, 2>' (/opt/xfopencv/2019.1/include/imgproc/xf_resize_nn_bilinear.hpp:423) automatically.
INFO: [XFORM 203-602] Inlining function 'xf::Mat<0, 480, 640, 1>::read' into 'xf::xfMat2AXIvideo<24, 0, 480, 640, 1>' (/opt/xfopencv/2019.1/include/common/xf_infra.h:70->/opt/xfopencv/2019.1/include/common/xf_infra.h:83->/opt/xfopencv/2019.1/include/common/xf_infra.h:202) automatically.
Call parameter type does not match function signature!
  %ss.data.V = alloca [76800 x i8], align 1
 [307200 x i8]*  call fastcc void @"xf::resize<1, 0, 480, 640, 160, 480, 1, 2>"(i32* %src.rows, i32* %src.cols, [307200 x i8]* %src.data.V, i32* %ss.rows, i32* %ss.cols, [76800 x i8]* %ss.data.V), !dbg !26192
Call parameter type does not match function signature!
  %ss.data.V = alloca [76800 x i8], align 1
 [307200 x i8]*  call fastcc void @"xf::resize<1, 0, 160, 480, 480, 640, 1, 2>"(i32* %ss.rows, i32* %ss.cols, [76800 x i8]* %ss.data.V, i32* %disp.rows, i32* %disp.cols, [307200 x i8]* %disp.data.V), !dbg !26193
Broken module found, compilation aborted!
Stack dump:
0.	Running pass 'Function Pass Manager' on module '/home/eli/git/bjp/sfsfsfsfsf/ips/xf_dummy/prj_xf_dummy_2019.1/solution1/.autopilot/db/a.g.1.bc'.
1.	Running pass 'Module Verifier' on function '@xf_dummy_filter'
/tools/Xilinx/Vivado/2019.1/bin/rdiArgs.sh: line 267:  7313 Aborted                 (core dumped) "$RDI_PROG" "$@"
Warning: HLS Process returned an error, skipping report opening!
Aborted!

@3togo
Author

3togo commented Jun 27, 2019

I have attached a testing program prj_xf_dummy_2019.1.zip above. Can anyone confirm whether the bug affects everyone or only affects me?

Attached is the synthesis log.
syn.log

@bgouthamb
Contributor

@3togo
We acknowledge this issue. It is a bug in the 2019.1 Vivado HLS synthesis flow: the tool throws the error you reported when a function with xf::Mat interfaces is instantiated more than once with different parameters (essentially creating different RTL for each instance). In your case, the xf::resize function is called twice with different template parameters.
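
To make the acknowledged cause concrete, here is a minimal sketch in plain C++. `MockMat` and `mock_resize` are hypothetical stand-ins, not the xfOpenCV types, and the nearest-neighbour body is only a placeholder. It illustrates two points: each set of template parameters yields a separate function with its own buffer type, and the array sizes in the reported log (38400, 76800, 307200) are rows × cols for 8-bit single-channel images. It does not reproduce the HLS failure itself.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for xf::Mat: the buffer size is fixed by the
// template parameters, so different dimensions give different types.
template <int ROWS, int COLS>
struct MockMat {
    static constexpr std::size_t kPixels = static_cast<std::size_t>(ROWS) * COLS;
    unsigned char data[kPixels];
};

// Hypothetical stand-in for xf::resize (nearest-neighbour placeholder).
template <int SRC_ROWS, int SRC_COLS, int DST_ROWS, int DST_COLS>
void mock_resize(const MockMat<SRC_ROWS, SRC_COLS>& src,
                 MockMat<DST_ROWS, DST_COLS>& dst) {
    for (int r = 0; r < DST_ROWS; ++r) {
        for (int c = 0; c < DST_COLS; ++c) {
            const int sr = r * SRC_ROWS / DST_ROWS;  // nearest source row
            const int sc = c * SRC_COLS / DST_COLS;  // nearest source column
            dst.data[r * DST_COLS + c] = src.data[sr * SRC_COLS + sc];
        }
    }
}

// Buffer sizes that appear in the reported LLVM IR.
static_assert(MockMat<160, 240>::kPixels == 38400, "160 x 240");
static_assert(MockMat<160, 480>::kPixels == 76800, "160 x 480");
static_assert(MockMat<480, 640>::kPixels == 307200, "480 x 640");

// The downscale and upscale instantiations have different signatures.
using DownFn = void (*)(const MockMat<480, 640>&, MockMat<160, 480>&);
using UpFn   = void (*)(const MockMat<160, 480>&, MockMat<480, 640>&);
static_assert(!std::is_same<DownFn, UpFn>::value,
              "two instantiations, two distinct function types");
```

In the log, the verifier complains that a `[38400 x i8]*` (or `[76800 x i8]*`) argument is passed where the callee expects `[307200 x i8]*`, which is consistent with the two instantiations being confused with one another.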

The fix has to come from the Vivado HLS tool, which might take some time. In the meantime, you can consider the following workarounds:

  1. Create a class and paste the xf::resize definition into it as a member function. Create a new object of the class for each call that needs to be made to the function.

    example:

class resizeWrapper
{
< copy the resize function definition here >

};

In the accel function:

resizeWrapper obj1, obj2;

obj1.resize<., 1080, 1920,..> (... , ...);
obj2.resize<., 2160, 3840,..> (... , ...);

  2. Copy the whole code of the resize function into another function with a different name, such as resize2. Use that function for the second call.

    example:

resize <., 1080, 1920,..> (... , ...);
resize2<., 2160, 3840,..> (... , ...);
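
The shape of workaround 1 can be sketched in plain C++ as follows. This is only an illustration of the code organization: `MockMat`, `ResizeWrapper` and `accel` are hypothetical names, the nearest-neighbour body is a placeholder for the real resize implementation, and plain C++ cannot show whether the HLS tool accepts the result.

```cpp
#include <cassert>

// Hypothetical stand-in for xf::Mat.
template <int ROWS, int COLS>
struct MockMat {
    unsigned char data[ROWS * COLS];
};

// Workaround 1: the whole resize body lives inside the class as a member
// template, instead of being a free function shared by all callers.
class ResizeWrapper {
public:
    template <int SRC_ROWS, int SRC_COLS, int DST_ROWS, int DST_COLS>
    void resize(const MockMat<SRC_ROWS, SRC_COLS>& src,
                MockMat<DST_ROWS, DST_COLS>& dst) {
        // Placeholder body (nearest neighbour); the real code would be the
        // pasted definition of the library function and its helpers.
        for (int r = 0; r < DST_ROWS; ++r) {
            for (int c = 0; c < DST_COLS; ++c) {
                const int sr = r * SRC_ROWS / DST_ROWS;
                const int sc = c * SRC_COLS / DST_COLS;
                dst.data[r * DST_COLS + c] = src.data[sr * SRC_COLS + sc];
            }
        }
    }
};

// Hypothetical top-level function: one object per call site.
void accel(const MockMat<4, 4>& in, MockMat<2, 2>& small, MockMat<4, 4>& out) {
    ResizeWrapper obj1, obj2;
    obj1.resize(in, small);   // 4x4 -> 2x2
    obj2.resize(small, out);  // 2x2 -> 4x4
}
```

Template arguments are deduced from the `MockMat` types here; with the real function, the explicit argument lists from the example above would still be needed because not every parameter appears in the argument types.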

@3togo
Author

3togo commented Jul 2, 2019

@bgouthamb ,

Many thanks for your reply. I modified the code as suggested in your workaround 1, but it makes no difference: the errors are still there. Below is the modified code.

Eli

#include "xf_dummy_accel.h"
#include "xf_config_params.h"
#include "imgproc/xf_resize.hpp"

#define MAXDOWNSCALE 2
#define INTERPOLATION   1
class JJ {
    public:
        template<int INTERPOLATION_TYPE, int TYPE, int SRC_ROWS, int SRC_COLS, int DST_ROWS, int DST_COLS, int NPC, int MAX_DOWN_SCALE>
        void resize (xf::Mat<TYPE, SRC_ROWS, SRC_COLS, NPC> & _src, xf::Mat<TYPE, DST_ROWS, DST_COLS, NPC> & _dst) {
            xf::resize<INTERPOLATION_TYPE, TYPE, SRC_ROWS, SRC_COLS, DST_ROWS, DST_COLS, NPC, MAX_DOWN_SCALE> (_src, _dst);
        }
};

void xf_dummy_filter(AXI_STREAM& src_data, AXI_STREAM& src_data_d)
{
#pragma HLS INTERFACE axis port=src_data
#pragma HLS INTERFACE axis port=src_data_d


XGRAY_SRC_IMAGE src(src_rows, src_cols);
XGRAY_SS_IMAGE ss(ss_rows, ss_cols);
XGRAY_DST_IMAGE disp(dst_rows, dst_cols);
JJ jj1, jj2;


#pragma HLS dataflow
xf::AXIvideo2xfMat(src_data, src);
//xf::resize <INTERPOLATION, XF_8UC1, src_rows, src_cols, ss_rows, ss_cols, XF_NPPC1, MAXDOWNSCALE> (src, ss);
//xf::resize <INTERPOLATION, XF_8UC1, ss_rows, ss_cols, dst_rows, dst_cols, XF_NPPC1, MAXDOWNSCALE> (ss, disp);
jj1.resize <INTERPOLATION, XF_8UC1, src_rows, src_cols, ss_rows, ss_cols, XF_NPPC1, MAXDOWNSCALE> (src, ss);
jj2.resize <INTERPOLATION, XF_8UC1, ss_rows, ss_cols, dst_rows, dst_cols, XF_NPPC1, MAXDOWNSCALE> (ss, disp);

xf::xfMat2AXIvideo(disp, src_data_d);
}

@bgouthamb
Contributor

@3togo ,

The class JJ has to contain the whole definition of xf::resize, including any sub-functions it calls internally, and #include "imgproc/xf_resize.hpp" should be removed.
Assuming that you use the BILINEAR interpolation method, here is how the class should be defined:


class JJ {

    public:

	/***********************************************************************/

	template<int DEPTH, int INTERPOLATION_TYPE, int NPPC>
	void interpolatePixel(XF_CTUNAME(DEPTH,NPPC) A0, XF_CTUNAME(DEPTH,NPPC) B0, XF_CTUNAME(DEPTH,NPPC) A1, XF_CTUNAME(DEPTH,NPPC) B1, ap_ufixed<12,2> Wx, ap_ufixed<12,2> Wy, XF_CTUNAME(DEPTH,NPPC) &pixel)
	{
	#pragma HLS inline
		if(INTERPOLATION_TYPE==XF_INTERPOLATION_NN)
		{
			pixel = A0;
		}
		else
		{
			ap_ufixed<12,2> Wxy;
			ap_int<16> val0,val1,val2;
			ap_fixed<28,18> P1,P2,P3,P4;
			ap_ufixed<28,18> one_num = 1.0;

			Wxy = (Wx*Wy);    // Wx - 0.32, Wy-0.32  (Wx*Wy-0.64)  Wxy - 0.32
			val0 = (A0+B1-(B0+A1));
			val1 = (B0-A0);
			val2 = (A1-A0);

			P1 = (val0*Wxy);		// val0(16.0) * Wxy(0.32) = P1(16.32)
			P2 = (val1*Wy);		// val1(16.0) * Wy(0.32) = P2(16.32)
			P3 = (val2*Wx);		// val1(16.0) * Wx(0.32) = P3(16.32)
			P4 = (A0);					// A0(8.0) P4(8.32)

			pixel = (XF_CTUNAME(DEPTH,NPPC))((ap_fixed<32,22>)(P1  + P2 + P3 + P4));
			// to get only integer part from sum of 8.32's , right shift by 32
		}
	}
	template<int DEPTH, int INTERPOLATION_TYPE, int NPPC, int T_INDEX_INT, int NUMBEROFINPUTWORDS>
	void computeOutputPixel(XF_TNAME(DEPTH,NPPC) A0[NUMBEROFINPUTWORDS], XF_TNAME(DEPTH,NPPC) B0[NUMBEROFINPUTWORDS], ap_uint<T_INDEX_INT> initIndex, ap_uint<T_INDEX_INT> indexx[XF_NPIXPERCYCLE(NPPC)], ap_ufixed<12,2> Wx[XF_NPIXPERCYCLE(NPPC)], ap_ufixed<12,2> Wy, XF_TNAME(DEPTH,NPPC) &pixel)
	{
	#pragma HLS inline
		const int PIXELDEPTH = XF_DTPIXELDEPTH(DEPTH,NPPC);
		/*if(indexx[XF_NPIXPERCYCLE(NPPC)-1] > (initIndex+NUMBEROFINPUTWORDS*XF_NPIXPERCYCLE(NPPC)-1))
			{
				std::cout << "Insufficient number of words to resize in X" << std::endl;
				return;
			}*/
		assert((indexx[XF_NPIXPERCYCLE(NPPC)-1] < (initIndex+NUMBEROFINPUTWORDS*XF_NPIXPERCYCLE(NPPC)-1)) && "Insufficient number of words to resize in X");

		XF_PTUNAME(DEPTH) unpackX1[XF_NPIXPERCYCLE(NPPC)*NUMBEROFINPUTWORDS];
	#pragma HLS ARRAY_PARTITION variable=unpackX1 complete dim=1
		XF_PTUNAME(DEPTH) unpackX2[XF_NPIXPERCYCLE(NPPC)*NUMBEROFINPUTWORDS];
	#pragma HLS ARRAY_PARTITION variable=unpackX2 complete dim=1
		XF_PTUNAME(DEPTH) outputPixel[XF_NPIXPERCYCLE(NPPC)];
	#pragma HLS ARRAY_PARTITION variable=outputPixel complete dim=1
		for(int k=0; k<NUMBEROFINPUTWORDS; k++)
		{
	#pragma HLS UNROLL
			for(int i=0; i<XF_NPIXPERCYCLE(NPPC); i++)
			{
	#pragma HLS UNROLL
				unpackX1[k*XF_NPIXPERCYCLE(NPPC)+i] = A0[k].range((i+1)*XF_DTPIXELDEPTH(DEPTH,NPPC)*XF_CHANNELS(DEPTH,NPPC)-1,i*XF_DTPIXELDEPTH(DEPTH,NPPC)*XF_CHANNELS(DEPTH,NPPC));
				unpackX2[k*XF_NPIXPERCYCLE(NPPC)+i] = B0[k].range((i+1)*XF_DTPIXELDEPTH(DEPTH,NPPC)*XF_CHANNELS(DEPTH,NPPC)-1,i*XF_DTPIXELDEPTH(DEPTH,NPPC)*XF_CHANNELS(DEPTH,NPPC));
			}
		}
		for(int i=0; i<XF_NPIXPERCYCLE(NPPC); i++)
		{
	#pragma HLS UNROLL

			for(int k=0; k<XF_CHANNELS(DEPTH,NPPC); k++)
			{
	#pragma HLS UNROLL
				XF_CTUNAME(DEPTH,NPPC) unpackX1temp[XF_NPIXPERCYCLE(NPPC)*NUMBEROFINPUTWORDS];
	#pragma HLS ARRAY_PARTITION variable=unpackX1temp complete dim=1
				XF_CTUNAME(DEPTH,NPPC) unpackX2temp[XF_NPIXPERCYCLE(NPPC)*NUMBEROFINPUTWORDS];
	#pragma HLS ARRAY_PARTITION variable=unpackX2temp complete dim=1
				for(int l=0; l<XF_NPIXPERCYCLE(NPPC)*NUMBEROFINPUTWORDS; l++)
				{
	#pragma HLS UNROLL
					unpackX1temp[l] = unpackX1[l].range((k+1)*PIXELDEPTH-1,k*PIXELDEPTH);
					unpackX2temp[l] = unpackX2[l].range((k+1)*PIXELDEPTH-1,k*PIXELDEPTH);
				}
				XF_CTUNAME(DEPTH,NPPC) currentoutput;
				interpolatePixel<DEPTH, INTERPOLATION_TYPE, NPPC>(unpackX1temp[indexx[i]-initIndex], unpackX2temp[indexx[i]-initIndex], unpackX1temp[indexx[i]-initIndex+1], unpackX2temp[indexx[i]-initIndex+1], Wx[i], Wy, currentoutput);
				outputPixel[i].range((k+1)*PIXELDEPTH-1,k*PIXELDEPTH) = currentoutput;
			}
		}

		for(int i=0; i<XF_NPIXPERCYCLE(NPPC); i++)
		{
	#pragma HLS UNROLL
			pixel.range((i+1)*XF_DTPIXELDEPTH(DEPTH,NPPC)*XF_CHANNELS(DEPTH,NPPC)-1,i*XF_DTPIXELDEPTH(DEPTH,NPPC)*XF_CHANNELS(DEPTH,NPPC)) = outputPixel[i];
		}
	}
	static uint64_t xfUDivResize (uint64_t in_n, unsigned short in_d)
	{
	#pragma HLS INLINE OFF
		uint32_t out_res = in_n/in_d;
		return out_res;
	}

	template<int NPPC, int T_SCALE_WIDTH, int T_SCALE_INT, int T_COMP_INDEX_WIDTH, int T_COMP_INDEX_INT>
	void scaleMult(ap_ufixed<T_SCALE_WIDTH,T_SCALE_INT> scalex, ap_fixed<T_COMP_INDEX_WIDTH,T_COMP_INDEX_INT> scaleXParallel[XF_NPIXPERCYCLE(NPPC)])
	{
	#pragma HLS INLINE
		for(int i=0; i<XF_NPIXPERCYCLE(NPPC); i++)
		{
	#pragma HLS PIPELINE
			scaleXParallel[i] = (ap_fixed<T_COMP_INDEX_WIDTH,T_COMP_INDEX_INT>)scalex*(ap_uint<8>)i;
		}
		return;
	}
	template<int T_INDEX_INT, int T_COMP_INDEX_WIDTH, int T_COMP_INDEX_INT, int T_SCALE_WIDTH, int T_SCALE_INT, int INTERPOLATION_TYPE>
	void scaleCompute(int currindex, ap_ufixed<T_SCALE_WIDTH,T_SCALE_INT> inscale, ap_fixed<T_COMP_INDEX_WIDTH,T_COMP_INDEX_INT> &ind_pre)
	{
		if(INTERPOLATION_TYPE==XF_INTERPOLATION_NN)
		{
			ind_pre = (ap_fixed<T_COMP_INDEX_WIDTH,T_COMP_INDEX_INT>)currindex*inscale + (ap_fixed<T_COMP_INDEX_WIDTH,T_COMP_INDEX_INT>)0.001;

		}
		else
		{
			ind_pre = ((ap_fixed<T_COMP_INDEX_WIDTH,T_COMP_INDEX_INT>)currindex + (ap_fixed<T_COMP_INDEX_WIDTH,T_COMP_INDEX_INT>)0.5)*inscale - (ap_fixed<T_COMP_INDEX_WIDTH,T_COMP_INDEX_INT>)0.5;
		}
	}
	template <int INTERPOLATION_TYPE, int T_COMP_INDEX_WIDTH, int T_COMP_INDEX_INT, int T_INDEX_INT, int T_SCALE_WIDTH, int T_SCALE_INT, int T_WEIGHT_WIDTH, int T_WEIGHT_INT, int NPPC>
	void computeInterpolation(int inrows, int incols, int j, int output_rows_count, ap_ufixed<T_SCALE_WIDTH,T_SCALE_INT> scalex, ap_fixed<T_COMP_INDEX_WIDTH,T_COMP_INDEX_INT> scaleXParallel[XF_NPIXPERCYCLE(NPPC)], ap_ufixed<T_SCALE_WIDTH,T_SCALE_INT> scaley, ap_uint<T_INDEX_INT> indexx[XF_NPIXPERCYCLE(NPPC)], ap_uint<T_INDEX_INT> &indexy, ap_uint<T_INDEX_INT> &nextYScale, ap_ufixed<T_WEIGHT_WIDTH,T_WEIGHT_INT> WeightX[XF_NPIXPERCYCLE(NPPC)], ap_ufixed<T_WEIGHT_WIDTH,T_WEIGHT_INT> &WeightY, ap_fixed<T_COMP_INDEX_WIDTH,T_COMP_INDEX_INT> indexx_pre_comp, ap_fixed<T_COMP_INDEX_WIDTH,T_COMP_INDEX_INT> indexy_pre_comp)
	{
		const int INDEX_INT = T_INDEX_INT;
		const int WEIGHT_WIDTH = T_WEIGHT_WIDTH;
		const int WEIGHT_INT = T_WEIGHT_INT;
		const int SCALE_WIDTH = T_SCALE_WIDTH;
		const int SCALE_INT = T_SCALE_INT;
		const int COMP_INDEX_WIDTH = T_COMP_INDEX_WIDTH;
		const int COMP_INDEX_INT = T_COMP_INDEX_INT;

		ap_fixed<COMP_INDEX_WIDTH,COMP_INDEX_INT> indexx_pre = 0;
		ap_fixed<COMP_INDEX_WIDTH,COMP_INDEX_INT> indexy_pre = 0;
		if(INTERPOLATION_TYPE==XF_INTERPOLATION_NN)
		{
			indexy_pre = indexy_pre_comp;
			nextYScale = indexy_pre+scaley;
			indexy = (ap_uint<INDEX_INT>)indexy_pre;
		}
		else
		{
			indexy_pre = indexy_pre_comp;
			nextYScale = indexy_pre+(ap_fixed<COMP_INDEX_WIDTH,COMP_INDEX_INT>)scaley;
			if(indexy_pre < 0)
			{
				indexy_pre = 0;
			}
			else if(indexy_pre > inrows-1)
			{
				indexy_pre = inrows-1;
			}
			indexy = (ap_uint<INDEX_INT>)indexy_pre;
			WeightY = ((ap_fixed<COMP_INDEX_WIDTH,COMP_INDEX_INT>)indexy_pre - (ap_fixed<COMP_INDEX_WIDTH,COMP_INDEX_INT>)indexy);
		}
		for(int i=0; i<XF_NPIXPERCYCLE(NPPC); i++)
		{
			ap_fixed<COMP_INDEX_WIDTH,COMP_INDEX_INT> indexy_pre = 0;
			if(INTERPOLATION_TYPE==XF_INTERPOLATION_NN)
			{
				indexx_pre = indexx_pre_comp + scaleXParallel[i];
				indexx[i] = (ap_uint<INDEX_INT>)indexx_pre;
			}
			else
			{
				indexx_pre = indexx_pre_comp + scaleXParallel[i];
				if(indexx_pre < 0)
				{
					indexx_pre = 0;
				}
				else if(indexx_pre > incols-1)
				{
					indexx_pre = incols-1;
				}
				indexx[i] = (ap_uint<INDEX_INT>)indexx_pre;
				WeightX[i] = ((ap_fixed<COMP_INDEX_WIDTH,COMP_INDEX_INT>)indexx_pre - (ap_fixed<COMP_INDEX_WIDTH,COMP_INDEX_INT>)indexx[i]);
			}
		}
	}

	template<int SRC_TYPE, int INHEIGHT, int INWIDTH, int NPPC, int OUTHEIGHT, int OUTWIDTH, int INTERPOLATION_TYPE, int MAX_DOWN_SCALE>
	void resizeNNBilinear(xf::Mat<SRC_TYPE, INHEIGHT, INWIDTH, NPPC> &imgInput,xf::Mat<SRC_TYPE, OUTHEIGHT, OUTWIDTH, NPPC> &imgOutput)
	{
	#pragma HLS ALLOCATION instances=scaleCompute limit=1 function
	#pragma HLS ALLOCATION instances=xfUDivResize limit=1 function
		const int INDEX_INT = 17;
		const int WEIGHT_WIDTH = 12;
		const int WEIGHT_INT = 2;
		const int SCALE_WIDTH = 32;
		const int SCALE_INT = 3;
		const int PRE_INDEX_WIDTH = 10;
		const int PRE_INDEX_INT = 17;
		const int COMP_INDEX_WIDTH = SCALE_WIDTH+PRE_INDEX_WIDTH;
		const int COMP_INDEX_INT = SCALE_INT+PRE_INDEX_INT;
		const int BUFFER_WORDS = MAX_DOWN_SCALE;
		const int BUFFER_DUP_FACTOR = (BUFFER_WORDS+1)>>1;

		uint64_t xnew,ynew;

		xnew = (imgInput.cols);///(float)(out_width<<XF_BITSHIFT(NPPC));
		ynew = (imgInput.rows);//(float)(out_height);

		xnew = xnew << 28;
		ynew = ynew << 28;
		ap_ufixed<SCALE_WIDTH,SCALE_INT> scalex,scaley;
		uint64_t Xscale64,Yscale64;
		Xscale64 = xfUDivResize (xnew , (imgOutput.cols));
		Yscale64 = xfUDivResize (ynew , (imgOutput.rows));
		ap_ufixed<64,32> temp_scale_conv;

		temp_scale_conv = Xscale64;
		temp_scale_conv = temp_scale_conv >> 28;
		scalex = temp_scale_conv;

		temp_scale_conv = Yscale64;
		temp_scale_conv = temp_scale_conv >> 28;
		scaley = temp_scale_conv;

		ap_fixed<COMP_INDEX_WIDTH,COMP_INDEX_INT> scaleXParallel[XF_NPIXPERCYCLE(NPPC)];
	#pragma HLS ARRAY_PARTITION variable=scaleXParallel complete dim=1
		scaleMult<NPPC,SCALE_WIDTH,SCALE_INT,COMP_INDEX_WIDTH,COMP_INDEX_INT>(scalex,scaleXParallel);

		XF_TNAME(SRC_TYPE,NPPC) line_buffer[3][BUFFER_DUP_FACTOR][INWIDTH>>(XF_BITSHIFT(NPPC))];
	#pragma HLS ARRAY_PARTITION variable=line_buffer complete dim=1
	#pragma HLS ARRAY_PARTITION variable=line_buffer complete dim=2
		int input_read_pointer=0;
		int read_rows_count = 0;
		int output_write_pointer = 0;
		for(int i=0; i<2; i++) //read two rows
		{
	#pragma HLS LOOP_TRIPCOUNT min=1 max=2
			for(int j=0; j<(imgInput.cols>>(XF_BITSHIFT(NPPC))); j++)
			{
	#pragma HLS PIPELINE
	#pragma HLS LOOP_TRIPCOUNT min=1 max=INWIDTH/NPPC
				for(int k=0; k<BUFFER_DUP_FACTOR ; k++)
				{
					line_buffer[i][k][j] = imgInput.read(input_read_pointer);
				}
				input_read_pointer++;
			}
			read_rows_count++;
		}
		int output_rows_count = 0;
		int first_row_index = 0;
		int second_row_index = 1;
		int read_row_index = 2;
		int loop_row_count = (imgOutput.rows > imgInput.rows)? imgOutput.rows : imgInput.rows;
		int loop_col_count = (imgOutput.cols > imgInput.cols)? imgOutput.cols : imgInput.cols;
		const int LOOPCOUNTROW = (INHEIGHT>OUTHEIGHT)? INHEIGHT: OUTHEIGHT;
		const int LOOPCOUNTCOL = (INWIDTH>OUTWIDTH)? INWIDTH: OUTWIDTH;
		ap_uint<INDEX_INT> indexx[XF_NPIXPERCYCLE(NPPC)];
	#pragma HLS ARRAY_PARTITION variable=indexx complete dim=1
		ap_uint<INDEX_INT> indexy = 0;
		ap_uint<INDEX_INT> nextYScale = 0;
		ap_ufixed<WEIGHT_WIDTH,WEIGHT_INT> WeightX[XF_NPIXPERCYCLE(NPPC)];
	#pragma HLS ARRAY_PARTITION variable=WeightX complete dim=1
		ap_ufixed<WEIGHT_WIDTH,WEIGHT_INT> WeightY = 0;
		XF_TNAME(SRC_TYPE,NPPC) P0Buf[BUFFER_DUP_FACTOR<<1];
	#pragma HLS ARRAY_PARTITION variable=P0Buf complete dim=1
		XF_TNAME(SRC_TYPE,NPPC) P1Buf[BUFFER_DUP_FACTOR<<1];
	#pragma HLS ARRAY_PARTITION variable=P1Buf complete dim=1

		ap_fixed<COMP_INDEX_WIDTH,COMP_INDEX_INT> indexx_pre_comp = 0;
		ap_fixed<COMP_INDEX_WIDTH,COMP_INDEX_INT> indexy_pre_comp = 0;

		for(int i=0; i<loop_row_count; i++)
		{
	#pragma HLS LOOP_TRIPCOUNT min=1 max=LOOPCOUNTROW
			scaleCompute<INDEX_INT, COMP_INDEX_WIDTH, COMP_INDEX_INT, SCALE_WIDTH, SCALE_INT, INTERPOLATION_TYPE>(output_rows_count, scaley, indexy_pre_comp);
			for(int j=0; j<(loop_col_count>>(XF_BITSHIFT(NPPC))); j++)
			{
	#pragma HLS PIPELINE
	#pragma HLS LOOP_TRIPCOUNT min=1 max=LOOPCOUNTCOL/NPPC

				scaleCompute<INDEX_INT, COMP_INDEX_WIDTH, COMP_INDEX_INT, SCALE_WIDTH, SCALE_INT, INTERPOLATION_TYPE>(j<<(XF_BITSHIFT(NPPC)), scalex, indexx_pre_comp);
				computeInterpolation<INTERPOLATION_TYPE, COMP_INDEX_WIDTH, COMP_INDEX_INT, INDEX_INT, SCALE_WIDTH, SCALE_INT, WEIGHT_WIDTH, WEIGHT_INT, NPPC>(imgInput.rows, imgInput.cols, j<<(XF_BITSHIFT(NPPC)), output_rows_count, scalex, scaleXParallel, scaley, indexx, indexy, nextYScale, WeightX, WeightY, indexx_pre_comp, indexy_pre_comp);
				int indexstores = first_row_index;
				XF_TNAME(SRC_TYPE,NPPC) read_pixel;
				bool flag_write = 0;
				if(read_rows_count != imgInput.rows)
				{
					if((nextYScale >= read_rows_count-1)) //check if the next index y needed needs to be read.
					{
						if(j<(imgInput.cols>>(XF_BITSHIFT(NPPC))))
						{
							read_pixel = imgInput.read(input_read_pointer);
							flag_write = 1;
							input_read_pointer++;
						}
						else
						{
							flag_write = 0;
						}
					}
					else
					{
						flag_write = 0;
					}
				}
				else
				{
					flag_write = 0;
				}

				if(indexstores == 0)
				{
					for(int k=0; k<BUFFER_DUP_FACTOR; k++)
					{
	#pragma HLS UNROLL
						int idx = (indexx[0]>>XF_BITSHIFT(NPPC))+(k<<1);
						int idx_nxt = idx + (indexx[0] == (imgInput.cols-1) ? 0 : 1);

						P0Buf[(k<<1)]   = line_buffer[0][k][idx];
						P0Buf[(k<<1)+1] = line_buffer[0][k][idx_nxt];
						P1Buf[(k<<1)]   = line_buffer[1][k][idx];
						P1Buf[(k<<1)+1] = line_buffer[1][k][idx_nxt];
					}
					if(flag_write)
					{
						for(int k=0; k<BUFFER_DUP_FACTOR; k++)
						{
	#pragma HLS UNROLL
							line_buffer[2][k][j] = read_pixel;
						}
					}
				}
				else if(indexstores == 1)
				{
					for(int k=0; k<BUFFER_DUP_FACTOR; k++)
					{
	#pragma HLS UNROLL
						int idx = (indexx[0]>>XF_BITSHIFT(NPPC))+(k<<1);
						int idx_nxt = idx + (indexx[0] == (imgInput.cols-1) ? 0 : 1);

						P0Buf[(k<<1)]   = line_buffer[1][k][idx];
						P0Buf[(k<<1)+1] = line_buffer[1][k][idx_nxt];
						P1Buf[(k<<1)]   = line_buffer[2][k][idx];
						P1Buf[(k<<1)+1] = line_buffer[2][k][idx_nxt];
					}
					if(flag_write)
					{
						for(int k=0; k<BUFFER_DUP_FACTOR; k++)
						{
	#pragma HLS UNROLL
							line_buffer[0][k][j] = read_pixel;
						}
					}
				}
				else
				{
					for(int k=0; k<BUFFER_DUP_FACTOR; k++)
					{
	#pragma HLS UNROLL
						int idx = (indexx[0]>>XF_BITSHIFT(NPPC))+(k<<1);
						int idx_nxt = idx + (indexx[0] == (imgInput.cols-1) ? 0 : 1);

						P0Buf[(k<<1)]   = line_buffer[2][k][idx];
						P0Buf[(k<<1)+1] = line_buffer[2][k][idx_nxt];
						P1Buf[(k<<1)]   = line_buffer[0][k][idx];
						P1Buf[(k<<1)+1] = line_buffer[0][k][idx_nxt];
					}
					if(flag_write)
					{
						for(int k=0; k<BUFFER_DUP_FACTOR; k++)
						{
	#pragma HLS UNROLL
							line_buffer[1][k][j] = read_pixel;
						}
					}
				}
				if((output_rows_count <= imgOutput.rows-1) && (((indexy == read_rows_count-1) && (read_rows_count == imgInput.rows)) || (indexy == read_rows_count-2)))
				{
					if(j<(imgOutput.cols>>(XF_BITSHIFT(NPPC))))
					{
						if(indexy == read_rows_count-1)
						{
							for(int k=0; k<BUFFER_WORDS; k++)
							{
	#pragma HLS UNROLL
								P0Buf[k] = P1Buf[k];
							}
						}
						XF_TNAME(SRC_TYPE,NPPC) temp_store_output;
						computeOutputPixel<SRC_TYPE,INTERPOLATION_TYPE,NPPC,INDEX_INT,BUFFER_WORDS>(P0Buf,P1Buf,((indexx[0]>>XF_BITSHIFT(NPPC))<<XF_BITSHIFT(NPPC)),indexx,WeightX,WeightY,temp_store_output);
						imgOutput.write(output_write_pointer,temp_store_output);
						output_write_pointer++;
					}
				}
			}
			if((output_rows_count <= imgOutput.rows-1) && (((indexy == read_rows_count-1) && (read_rows_count == imgInput.rows)) || (indexy == read_rows_count-2)))
			{
				output_rows_count++;
			}
			if(read_rows_count != imgInput.rows)
			{
				if((nextYScale >= read_rows_count-1)) //check if the next index y needed needs to be read.
				{
					first_row_index++;
					second_row_index++;
					read_row_index++;
					if(read_row_index == 3)
					{
						read_row_index = 0;
					}
					if(first_row_index == 3)
					{
						first_row_index = 0;
					}
					if(second_row_index == 3)
					{
						second_row_index = 0;
					}
					read_rows_count++;
				}
			}
		}
	}





	/***********************************************************************/

	template<int INTERPOLATION_TYPE, int TYPE, int SRC_ROWS, int SRC_COLS, int DST_ROWS, int DST_COLS, int NPC, int MAX_DOWN_SCALE>
	void resize (xf::Mat<TYPE, SRC_ROWS, SRC_COLS, NPC> & _src, xf::Mat<TYPE, DST_ROWS, DST_COLS, NPC> & _dst)
	{
	#pragma HLS INLINE OFF

		assert(  ((INTERPOLATION_TYPE == XF_INTERPOLATION_NN)
			||(INTERPOLATION_TYPE == XF_INTERPOLATION_BILINEAR)
			||(INTERPOLATION_TYPE == XF_INTERPOLATION_AREA)) && "Incorrect parameters interpolation type");

		if(INTERPOLATION_TYPE == XF_INTERPOLATION_AREA)
			assert( (NPC == XF_NPPC1)  && "Supported Operation Mode for Area Interpolation is XF_NPPC1. XF_NPPC2, XF_NPPC4 and XF_NPPC8 are not supported ");
		else
			assert( ((NPC == XF_NPPC8) || (NPC == XF_NPPC4) || (NPC == XF_NPPC2) || (NPC == XF_NPPC1) )  && "Supported Operation Modes XF_NPPC8, XF_NPPC4, XF_NPPC2 and XF_NPPC1");

		if(NPC == XF_NPPC2)
			assert((((_src.cols & 1) == 0) && ((_dst.cols & 1) == 0)) && "Input and ouput image widths should be multiples of 2 in NPPC2 mode");
		if(NPC == XF_NPPC4)
			assert((((_src.cols & 3) == 0) && ((_dst.cols & 3) == 0)) && "Input and ouput image widths should be multiples of 4 in NPPC4 mode");
		if(NPC == XF_NPPC8)
			assert((((_src.cols & 7) == 0) && ((_dst.cols & 7) == 0)) && "Input and ouput image widths should be multiples of 8 in NPPC8 mode");

		resizeNNBilinear<TYPE, SRC_ROWS, SRC_COLS, NPC, DST_ROWS, DST_COLS, INTERPOLATION_TYPE, MAX_DOWN_SCALE>(_src,_dst);
	}
};
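
The difference between the earlier forwarding wrapper and the self-contained class above can be sketched in plain C++. All names here (`lib::double_all`, `lib::call_count`, `ForwardingWrapper`, `SelfContainedWrapper`) are hypothetical, and the counter is only an illustration device: it shows that a forwarding member still ends up in the shared free function, whereas a self-contained member never touches it. The HLS tool's behaviour is not modelled.

```cpp
#include <cassert>
#include <cstddef>

namespace lib {
// Counts how often the shared free function is reached (illustration only).
inline int& call_count() {
    static int n = 0;
    return n;
}

// Hypothetical shared library function template.
template <std::size_t N>
void double_all(const int (&in)[N], int (&out)[N]) {
    ++call_count();
    for (std::size_t i = 0; i < N; ++i) out[i] = in[i] * 2;
}
}  // namespace lib

// Like the first JJ attempt: the member only forwards, so every call
// still lands in the shared free function.
class ForwardingWrapper {
public:
    template <std::size_t N>
    void double_all(const int (&in)[N], int (&out)[N]) {
        lib::double_all(in, out);
    }
};

// Like the corrected JJ: the body is copied into the class, so the
// shared free function is never called.
class SelfContainedWrapper {
public:
    template <std::size_t N>
    void double_all(const int (&in)[N], int (&out)[N]) {
        for (std::size_t i = 0; i < N; ++i) out[i] = in[i] * 2;
    }
};
```

Both wrappers compute the same result; only the forwarding one increments `lib::call_count()`, which mirrors why the class has to carry the full definition and its helpers rather than include the library header.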

@3togo
Author

3togo commented Jul 3, 2019

@bgouthamb,
Your workaround works.

Many thanks

Joe

@3togo 3togo closed this as completed Jul 3, 2019
@3togo
Author

3togo commented Aug 30, 2019

A similar problem happens again when I call
xf::duplicateMat and xf::equalizeHist twice in a program.
Using a class does not work this time. I guess it is because these functions call other xf:: functions internally.
Any better workarounds?

@3togo 3togo reopened this Aug 30, 2019
@3togo
Author

3togo commented Sep 28, 2019

Any workaround?
